- However I cannot measure such a time in the interface: both in the original and in the slowed down version there is no clear point where the noise or the sound "begin", the attack is so "blurred" that I could measure anything between 2ms and 100ms (and that is on the same recording!)
So here I am with something I can hear and measure and see on the screen, yet I cannot quantify.
For this particular question, you can do the following:
- Adjust DP volume to become the loudest sound source
- Record DP sound plus action noise plus ambient noise with a microphone (as you did)
- Simultaneously record line-out from the DP (void of action/ambient sound)
- Normalize and correlate the two signals to find the best match. This indicates the time-relationship between the two recordings.
- Find the first indication of generated sound in the line-out recording. This indicates when the DP "knew" that a key has been pressed.
- Find the first indication of action noise in the microphone recording. It won't be buried below the piano sound because it's earlier.
- Possibly compensate for different sound travel time from action/piano to your microphone.
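The steps above could be sketched roughly as follows. This is a minimal sketch, not a finished tool: it assumes both recordings are mono NumPy arrays at the same sample rate `fs`, that a simple amplitude threshold is good enough to find "the first indication" of a sound, and that sound travels at about 343 m/s. All function names, the thresholds, and the `extra_path_m` parameter (how much farther the speaker is from the microphone than the action) are made up for illustration.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C


def normalize(x):
    """Remove the DC offset and scale the peak magnitude to 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


def best_lag(mic, line):
    """Number of samples by which `mic` lags behind `line` (best match)."""
    corr = np.correlate(normalize(mic), normalize(line), mode="full")
    # Index 0 of the "full" output corresponds to a lag of -(len(line) - 1).
    return int(np.argmax(corr)) - (len(line) - 1)


def first_onset(x, threshold):
    """Index of the first sample whose magnitude exceeds `threshold`."""
    hits = np.flatnonzero(np.abs(np.asarray(x, dtype=float)) > threshold)
    if hits.size == 0:
        raise ValueError("no sample exceeds the threshold")
    return int(hits[0])


def action_to_sound_ms(mic, line, fs, mic_threshold, line_threshold,
                       extra_path_m=0.0):
    """Delay from first action noise to first generated sound, in ms."""
    lag = best_lag(mic, line)
    # Start of the generated sound, expressed on the microphone's timeline.
    sound_at = first_onset(line, line_threshold) + lag
    # Action noise comes first, so the first loud mic sample is the action.
    action_at = first_onset(mic, mic_threshold)
    acoustic_ms = (sound_at - action_at) / fs * 1000.0
    # Remove the extra travel time of the speaker sound, if any.
    return acoustic_ms - extra_path_m / SPEED_OF_SOUND_M_S * 1000.0
```

`np.correlate` computes the correlation directly, which gets slow for recordings of several seconds; `scipy.signal.correlate` with `method="fft"` is a faster drop-in for the correlation step if SciPy is available. The thresholds need to sit just above the noise floor of each recording, and picking them is the part that needs the most care.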
To detect the action noise more precisely, you can make multiple recordings of the action noise, align them, and "average" them to create a synthetic ideal sound patch (you only need to do this once). Position the microphone to reduce noise that you repeatedly create with your fingers/body/bench. Or use different playing techniques for different samples so that unavoidable noise will average away in the final result. Correlate this ideal reference sound patch with your recordings to extract the action timing.
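A rough sketch of the averaging and matching idea, assuming the individual action-noise takes have already been trimmed to the same length and start at the same point of the keystroke (that alignment is the hard part and is not shown). The function names are hypothetical.

```python
import numpy as np


def build_template(takes):
    """Average equally long, already aligned action-noise takes."""
    stack = np.vstack([np.asarray(t, dtype=float) for t in takes])
    # Noise that differs from take to take shrinks in the mean.
    return stack.mean(axis=0)


def locate_template(recording, template):
    """Sample index where the template best matches the recording."""
    rec = np.asarray(recording, dtype=float)
    tpl = np.asarray(template, dtype=float)
    tpl = tpl - tpl.mean()
    # "valid" mode: output index i is the match score with the template
    # placed so that it starts at recording sample i.
    score = np.correlate(rec, tpl, mode="valid")
    return int(np.argmax(score))
```

The returned index is where the reference patch starts in the recording, so it can be used as the action time instead of a plain threshold crossing.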
You will probably run into problems because the action noise and the piano sound overlap. You can trim the action reference patch to be shorter than the expected latency; then there is no overlap. Likewise, you can trim (all) line-out inputs by an arbitrary amount (starting from the easy-to-identify onset of the sound) and correlate only the remainder with the microphone input. For this, consider the unique sample length of your DP and when it starts looping the same waveform over and over (expect 1-4 seconds, or consult the corresponding megathread here on PW). Also, as recommended at the beginning, turning the DP volume up high probably makes this unnecessary.
To measure acoustic instruments, you can use a visual approach. Open the instrument and position a camera to capture both finger/key and hammer/string in a single frame. Film a clock for a minute to make sure the actual frame rate matches the nominal one closely enough. Wobble the camera during a test capture to make sure it produces no tearing. If there is tearing, the lines that compose the image are not being captured at the same instant. In this case, rotate the camera until all relevant components are aligned horizontally in the output (ideally all are visible in one single line). Record your data, then step through the frames and count them from touch to string to get the time.
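The frame arithmetic is simple, but it helps to keep the resolution limit in view. A small sketch with hypothetical function names; the numbers in the usage note are made-up examples, not measurements.

```python
def measured_fps(frames_counted, clock_seconds):
    """Actual frame rate from filming a clock for a known duration."""
    return frames_counted / clock_seconds


def frames_to_ms(frame_count, fps):
    """Elapsed time for a frame count, plus the duration of one frame.

    The second value is the timing resolution: an event can happen
    anywhere between two consecutive frames.
    """
    frame_ms = 1000.0 / fps
    return frame_count * frame_ms, frame_ms
```

For example, if a one-minute clock capture contains 14400 frames, the camera really runs at 240 fps; a touch-to-string count of 6 frames then corresponds to about 25 ms, with a resolution of roughly 4 ms per frame.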
Hammer travel time depends on velocity. You can use a player piano (a pneumatic self-playing piano like those seen in Western movies) to generate consistent velocities. Alternatively, you can use a piano with a silent system and MIDI out to read the actual velocity and reject recordings that fall outside your target range. Lacking both, you may need to construct a device to get repeatable strikes (YouTube has examples).
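Rejecting takes by velocity could look like this. It is only a sketch: the `Take` record, its fields, and the tolerance are assumptions for illustration, and reading the velocity from the MIDI output is left out.

```python
from dataclasses import dataclass


@dataclass
class Take:
    label: str      # hypothetical identifier, e.g. a file name
    velocity: int   # MIDI note-on velocity (1..127) read from MIDI out


def within_velocity(takes, target, tolerance):
    """Keep only takes whose velocity is within `tolerance` of `target`."""
    return [t for t in takes if abs(t.velocity - target) <= tolerance]
```

With a target of 64 and a tolerance of 3, a take played at velocity 70 would be dropped and one at 66 kept.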
DPs also adapt to velocity, although the effect is probably less pronounced. The audio "peaks" may be less aggressive, or an artificial delay may be introduced. For fair comparisons you should also control velocity on DPs.
I know this response is very pedantic, but you asked for it, didn't you?