I'll say it straight off: I'm no expert. But the first part of your question seems relatively straightforward to answer. The easiest way is to build test facilities at different altitudes, collect the performance data you're after at each altitude you can, and then extrapolate to whichever atmospheric pressure you later require.
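To make the extrapolation step concrete, here is a minimal sketch. It assumes the textbook relation that, with chamber conditions held constant and the nozzle flowing full (no flow separation), thrust falls linearly with ambient pressure: F = F_vac - A_e * p_a, where A_e is the nozzle exit area. The function names and the sample numbers are made up for illustration, not taken from any real engine.

```python
def fit_thrust_vs_pressure(samples):
    """Least-squares fit of F = f_vac - a_exit * p_ambient.

    samples: list of (ambient_pressure_pa, thrust_n) pairs, one per test site.
    Returns (f_vac_n, a_exit_m2). Assumes the linear model described above.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need measurements at two or more ambient pressures")
    mean_p = sum(p for p, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    sxx = sum((p - mean_p) ** 2 for p, _ in samples)
    if sxx == 0:
        raise ValueError("all measurements share the same ambient pressure")
    sxy = sum((p - mean_p) * (f - mean_f) for p, f in samples)
    slope = sxy / sxx                      # equals -a_exit in the model
    f_vac = mean_f - slope * mean_p        # intercept at zero ambient pressure
    return f_vac, -slope


def thrust_at(ambient_pressure_pa, f_vac_n, a_exit_m2):
    """Predicted thrust at a given ambient pressure under the linear model."""
    return f_vac_n - a_exit_m2 * ambient_pressure_pa


# Hypothetical readings: sea level and a site at roughly 1,650 m altitude.
samples = [(101325.0, 49337.5), (83000.0, 58500.0)]
f_vac, a_exit = fit_thrust_vs_pressure(samples)
thrust_at_30_km = thrust_at(1200.0, f_vac, a_exit)  # ~1.2 kPa ambient
```

With only two test sites the fit is just a straight line through two points; more sites give the least-squares fit something to average over. The model stops being valid once the nozzle is overexpanded enough for the flow to separate.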
E.g. ISRO has a high-altitude test facility at Mahendragiri, India (1,654 m / 5,427 ft); JAXA can simulate atmospheric conditions at an altitude of approximately 30 km at their High Altitude for Rocket Engine Test Facility; SpaceX is leasing a launch pad in Las Cruces, New Mexico (4,000 ft / 1,219 m); DLR has an Altitude Simulation Test facility in Lampoldshausen; and so on.
The thrust itself is usually measured with load cells, which contain multiple strain gauges oriented at different angles (usually four of them, two on each side, oriented perpendicular to each other) and convert their deformation (strain under load) into an electrical signal. Placing multiple load cells to measure the loads on the rigid frame the rocket engine is mounted to during a test fire should be sufficient to measure the thrust vector, since you can then measure the forces each side of the frame is subjected to.
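As a rough illustration of how several load cell readings turn into a thrust vector, here is a minimal sketch. It assumes an idealised stand where each load cell reports only the force along its own sensing axis, and it ignores moments, frame flex and cross-talk between cells, which a real stand would have to calibrate out. The function names and the readings are hypothetical.

```python
import math


def thrust_vector(cells):
    """Sum load cell readings into one force vector.

    cells: list of ((ux, uy, uz), force_n), where the tuple is the direction of
    the cell's sensing axis in the stand frame and force_n is its signed reading.
    """
    fx = fy = fz = 0.0
    for (ux, uy, uz), force in cells:
        norm = math.sqrt(ux * ux + uy * uy + uz * uz)
        if norm == 0:
            raise ValueError("sensing axis must be a non-zero vector")
        # Normalise the axis so readings are weighted by direction only.
        fx += force * ux / norm
        fy += force * uy / norm
        fz += force * uz / norm
    return (fx, fy, fz)


def magnitude_and_offset_deg(vec, nominal_axis=(0.0, 0.0, 1.0)):
    """Return (magnitude, angle in degrees between vec and the nominal thrust axis)."""
    mag = math.sqrt(sum(c * c for c in vec))
    axis_norm = math.sqrt(sum(c * c for c in nominal_axis))
    if mag == 0 or axis_norm == 0:
        return mag, 0.0
    cos_angle = sum(v * a for v, a in zip(vec, nominal_axis)) / (mag * axis_norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return mag, math.degrees(math.acos(cos_angle))


# Hypothetical readings: one axial cell and two lateral cells.
cells = [((0, 0, 1), 50000.0), ((1, 0, 0), 400.0), ((0, 1, 0), -250.0)]
magnitude, offset_deg = magnitude_and_offset_deg(thrust_vector(cells))
```

The offset angle tells you how far the measured thrust line is from the nominal engine axis, which is the number you would care about for gimbal or alignment checks.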
The exhaust plume itself is usually measured with multiple infrared cameras that record its heat signature from various angles in real time. All this data would be combined with data from a variety of sensors built into the rocket engine itself, e.g. ones measuring propellant pressure at the injectors (injection rate), and so on.