Viewed   81 times

I have an app in which I use an AudioTrack in streaming mode to play dynamically generated audio. The app doesn't have to respond instantaneously to inputs, so the latency issues don't bother me for that side of the program.

The problem is that I have an animation that needs to be as precisely 'in-sync' as possible with the audio and it seems that different devices have different amounts of time between when the AudioTrack stops blocking the write() call and asks for more data, and when that audio is played from the speaker.

My current solution gets me most of the way there -- I count the number of frames I've passed in to the AudioTrack so far, and compare it to getPlaybackHeadPosition(). It looks basically like:

long currentTimeInFrames = 0;
while(playingAudio) {
  currentTimeInFrames += numberOfFramesToWrite;
  long delayInFrames = (currentTimeInFrames - audioTrack.getPlaybackHeadPosition());

However, there's still some latency that getPlaybackHeadPosition() doesn't seem to account for that varies by device.

Is there a way to poll the system for the latency of the AudioTrack?



Consider driver's latency. There's hidden function AudioManager.getOutputLatency(int) to get this.

Call it like this:

AudioManager am = (AudioManager)getSystemService(Context.AUDIO_SERVICE);
   Method m = am.getClass().getMethod("getOutputLatency", int.class);
   latency = (Integer)m.invoke(am, AudioManager.STREAM_MUSIC);
}catch(Exception e){

I get about 45 - 50 ms on different devices. Use the result in your calculations.

Saturday, September 10, 2022

You can use gson.jar to store class objects into SharedPreferences. You can download this jar from google-gson

Or add the GSON dependency in your Gradle file:

implementation ''

Creating a shared preference:

SharedPreferences  mPrefs = getPreferences(MODE_PRIVATE);

To save:

MyObject myObject = new MyObject;
//set variables of 'myObject', etc.

Editor prefsEditor = mPrefs.edit();
Gson gson = new Gson();
String json = gson.toJson(myObject);
prefsEditor.putString("MyObject", json);

To retrieve:

Gson gson = new Gson();
String json = mPrefs.getString("MyObject", "");
MyObject obj = gson.fromJson(json, MyObject.class);
Monday, September 19, 2022

Like user harikris suggests, I would highly recommend you move all your audio playback and processing code to Android NDK using the OpenSL ES library for the best performance.

As I understand, the AudioTrack API is built on top of OpenSL ES Buffer Queue Audio Player. So you could probably improve performance by working directly with the NDK, writing C code that is called from your Java/Android layer to work with the sound.

The native-audio example mentioned above contains code that will show you how to play a sound file directly from a URI. In my experience, the results from this method are better than AudioTrack in Static Mode.

Soundpool is generally reserved for very short sounds that can be played from memory and its not a scalable solution for your sequencer especially if you introduce large files.

Here are some links that have helped me out with my applications: -General Information on OpenSL ES for Android:

-An Android audio blog with some great example code:

Edit: The old mobile pearl link appears to be down. Here's a working one:

Wednesday, December 14, 2022

I am not familiar with android audio, so I can't answer all your questions, but I can tell you what the fundamental problem is: adding audio data byte-by-byte won't work. Since it sort-of works, and from looking at your code, and the fact that it's most common, I'm going to assume you have 16-bit PCM data. Yet everywhere, you are dealing with bytes. Bytes are not appropriate for processing audio (unless the audio happens to be 8-bit)

Bytes are aprox +/- 128. You say "I would expect to see values ranging from −32768 to 32767 in logcat from the Log.d(...) invocation above, but instead the results tend to be within the range of -100 to 100 (with some outliers beyond that)" Well, how could you possibly go to that range when you are printing values from a byte array? The correct datatype for 16 bit signed data is short, not byte. If you were printing short values, you'd see the range you expected.

You must convert your bytes to shorts and sum the shorts. This will take care of much of the misc noise you are hearing. Since you are reading right off the file, though, why bother converting? why not read it off the file as a short using something like this

The next issue is that you must deal with out-of-range values, rather than letting them "wrap around". The simplest solution is simply to do the summing as integers, "clip" into the short range, and then store the clipped output. This will get rid of your clicks and pops.

In psuedo-code, the entire process will look something like this:

file1 = Open file 1
file2 = Open file 2
output = Open output for writing

numSampleFrames1 = file1.readHeader()
numSampleFrames2 = file2.readHeader()
numSampleFrames = min( numSampleFrames1, numSampleFrames2 )
output.createHeader( numSampleFrames )

for( int i=0; i<numSampleFrames * channels; ++i ) {
    //read data from file 1
    int a = file1.readShort();
    //read data from file 2, and add it to data we read from file 1
    a += file2.readShort();
    //clip into range
    if( a > Short.MAX_VALUE )
       a = Short.MAX_VALUE;
    if( a < Short.MIN_VALUE )
       a = Short.MIN_VALUE;
    //write it to the output
    output.writeShort( (Short) a );

You will get a little distortion from the "clipping" step, but there's no simple way around that, and clipping is MUCH better than wrap-around. (that said, unless your tracks are extremely "hot", and heavy in the low frequencies, the distortion shouldn't be too noticeable. If it is a problem, you can do other things: multiply a by .5 for example and skip the clipping, but then your output will be much quieter, which, on a phone, is probably not what you want).

Thursday, August 18, 2022

This works for me to set the input volume on my MacBook Pro (2011 model). It is a bit funky, I had to try setting the master channel volume, then each independent stereo channel volume until I found one that worked. Look through the comments in my code, I suspect the best way to tell if your code is working is to find a get/set-property combination that works, then do something like get/set (something else)/get to verify that your setter is working.

Oh, and I'll point out of course that I wouldn't rely on the values in address staying the same across getProperty calls as I'm doing here. It seems to work but it's definitely bad practice to rely on struct values being the same when you pass one by reference to a function. This is of course sample code so please forgive my laziness. ;)

//  main.c
//  testInputVolumeSetter

#include <CoreFoundation/CoreFoundation.h>
#include <CoreAudio/CoreAudio.h>

OSStatus setDefaultInputDeviceVolume( Float32 toVolume );

int main(int argc, const char * argv[]) {
    OSStatus                        err;

    // Load the Sound system preference, select a default
    // input device, set its volume to max.  Now set
    // breakpoints at each of these lines.  As you step over
    // them you'll see the input volume change in the Sound
    // preference panel.
    // On my MacBook Pro setting the channel[ 1 ] volume
    // on the default microphone input device seems to do
    // the trick.  channel[ 0 ] reports that it works but
    // seems to have no effect and the master channel is
    // unsettable.
    // I do not know how to tell which one will work so
    // probably the best thing to do is write your code
    // to call getProperty after you call setProperty to
    // determine which channel(s) work.
    err = setDefaultInputDeviceVolume( 0.0 );
    err = setDefaultInputDeviceVolume( 0.5 );
    err = setDefaultInputDeviceVolume( 1.0 );

// 0.0 == no volume, 1.0 == max volume
OSStatus setDefaultInputDeviceVolume( Float32 toVolume ) {
    AudioObjectPropertyAddress      address;
    AudioDeviceID                   deviceID;
    OSStatus                        err;
    UInt32                          size;
    UInt32                          channels[ 2 ];
    Float32                         volume;

    // get the default input device id
    address.mSelector = kAudioHardwarePropertyDefaultInputDevice;
    address.mScope = kAudioObjectPropertyScopeGlobal;
    address.mElement = kAudioObjectPropertyElementMaster;

    size = sizeof(deviceID);
    err = AudioObjectGetPropertyData( kAudioObjectSystemObject, &address, 0, nil, &size, &deviceID );

    // get the input device stereo channels
    if ( ! err ) {
        address.mSelector = kAudioDevicePropertyPreferredChannelsForStereo;
        address.mScope = kAudioDevicePropertyScopeInput;
        address.mElement = kAudioObjectPropertyElementWildcard;
        size = sizeof(channels);
        err = AudioObjectGetPropertyData( deviceID, &address, 0, nil, &size, &channels );

    // run some tests to see what channels might respond to volume changes
    if ( ! err ) {
        Boolean                     hasProperty;

        address.mSelector = kAudioDevicePropertyVolumeScalar;
        address.mScope = kAudioDevicePropertyScopeInput;

        // On my MacBook Pro using the default microphone input:

        address.mElement = kAudioObjectPropertyElementMaster;
        // returns false, no VolumeScalar property for the master channel
        hasProperty = AudioObjectHasProperty( deviceID, &address );

        address.mElement = channels[ 0 ];
        // returns true, channel 0 has a VolumeScalar property
        hasProperty = AudioObjectHasProperty( deviceID, &address );

        address.mElement = channels[ 1 ];
        // returns true, channel 1 has a VolumeScalar property
        hasProperty = AudioObjectHasProperty( deviceID, &address );

    // try to get the input volume
    if ( ! err ) {
        address.mSelector = kAudioDevicePropertyVolumeScalar;
        address.mScope = kAudioDevicePropertyScopeInput;

        size = sizeof(volume);
        address.mElement = kAudioObjectPropertyElementMaster;
        // returns an error which we expect since it reported not having the property
        err = AudioObjectGetPropertyData( deviceID, &address, 0, nil, &size, &volume );

        size = sizeof(volume);
        address.mElement = channels[ 0 ];
        // returns noErr, but says the volume is always zero (weird)
        err = AudioObjectGetPropertyData( deviceID, &address, 0, nil, &size, &volume );

        size = sizeof(volume);
        address.mElement = channels[ 1 ];
        // returns noErr, but returns the correct volume!
        err = AudioObjectGetPropertyData( deviceID, &address, 0, nil, &size, &volume );

    // try to set the input volume
    if ( ! err ) {
        address.mSelector = kAudioDevicePropertyVolumeScalar;
        address.mScope = kAudioDevicePropertyScopeInput;

        size = sizeof(volume);

        if ( toVolume < 0.0 ) volume = 0.0;
        else if ( toVolume > 1.0 ) volume = 1.0;
        else volume = toVolume;

        address.mElement = kAudioObjectPropertyElementMaster;
        // returns an error which we expect since it reported not having the property
        err = AudioObjectSetPropertyData( deviceID, &address, 0, nil, size, &volume );

        address.mElement = channels[ 0 ];
        // returns noErr, but doesn't affect my input gain
        err = AudioObjectSetPropertyData( deviceID, &address, 0, nil, size, &volume );

        address.mElement = channels[ 1 ];
        // success! correctly sets the input device volume.
        err = AudioObjectSetPropertyData( deviceID, &address, 0, nil, size, &volume );

    return err;

EDIT in response to your question, "How'd [I] figure this out?"

I've spent a lot of time using Apple's audio code over the last five or so years and I've developed some intuition/process when it comes to where and how to look for solutions. My business partner and I co-wrote the original iHeartRadio apps for the first-generation iPhone and a few other devices and one of my responsibilities on that project was the audio portion, specifically writing an AAC Shoutcast stream decoder/player for iOS. There weren't any docs or open-source examples at the time so it involved a lot of trial-and-error and I learned a ton.

At any rate, when I read your question and saw the bounty I figured this was just low-hanging fruit (i.e. you hadn't RTFM ;-). I wrote a few lines of code to set the volume property and when that didn't work I genuinely got interested.

Process-wise maybe you'll find this useful:

Once I knew it wasn't a straightforward answer I started thinking about how to solve the problem. I knew the Sound System Preference lets you set the input gain so I started by disassembling it with otool to see whether Apple was making use of old or new Audio Toolbox routines (new as it happens):

Try using:

otool -tV /System/Library/PreferencePanes/Sound.prefPane/Contents/MacOS/Sound | bbedit

then search for Audio to see what methods are called (if you don't have bbedit, which every Mac developer should IMO, dump it to a file and open in some other text editor).

I'm most familiar with the older, deprecated Audio Toolbox routines (three years to obsolescence in this industry) so I looked at some Technotes from Apple. They have one that shows how to get the default input device and set its volume using the newest CoreAudio methods but as you undoubtedly saw their code doesn't work properly (at least on my MBP).

Once I got to that point I fell back on the tried-and-true: Start googling for keywords that were likely to be involved (e.g. AudioObjectSetPropertyData, kAudioDevicePropertyVolumeScalar, etc.) looking for example usage.

One interesting thing I've found about CoreAudio and using the Apple Toolbox in general is that there is a lot of open-source code out there where people try various things (tons of pastebins and GoogleCode projects etc.). If you're willing to dig through a bunch of this code you'll typically either find the answer outright or get some very good ideas.

In my search the most relevant things I found were the Apple technote showing how to get the default input device and set the master input gain using the new Toolbox routines (even though it didn't work on my hardware), and I found some code that showed setting the gain by channel on an output device. Since input devices can be multichannel I figured this was the next logical thing to try.

Your question is really good because at least right now there is no correct documentation from Apple that shows how to do what you asked. It's goofy too because both channels report that they set the volume but clearly only one of them does (the input mic is a mono source so this isn't surprising, but I consider having a no-op channel and no documentation about it a bit of a bug on Apple).

This happens pretty consistently when you start dealing with Apple's cutting-edge technologies. You can do amazing things with their toolbox and it blows every other OS I've worked on out of the water but it doesn't take too long to get ahead of their documentation, particularly if you're trying to do anything moderately sophisticated.

If you ever decide to write a kernel driver for example you'll find the documentation on IOKit to be woefully inadequate. Ultimately you've got to get online and dig through source code, either other people's projects or the OS X source or both, and pretty soon you'll conclude as I have that the source is really the best place for answers (even though is pretty awesome).

Thanks for the points and good luck with your project :)

Thursday, August 25, 2022
Only authorized users can answer the search term. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :