Closed Bug 783512 Opened 12 years ago Closed 6 years ago

Event "ended" does not fire when audio tag uses a MP3 as src

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-kilimanjaro:+, blocking-basecamp:-)

RESOLVED WONTFIX
blocking-kilimanjaro +
blocking-basecamp -

People

(Reporter: dkuo, Unassigned)

References

Details

(Whiteboard: [LOE:S][qa+])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Mobile; rv:17.0) Gecko/17.0 Firefox/17.0

Steps to reproduce:

  1. Add a event listener to an audio tag:

    this.audio.addEventListener('ended', this);

  2. This is how I created the src:

    var url = URL.createObjectURL(file);
    this.audio.src = url;

  3. Handle the ended event in a switch case:

    case 'ended':
      this.next();
      break;

Actual results:

  The ended event did not fire since this.next() was never been called.
Dominic, does this bug block any work you're doing on Music app for V1?
No, it does not block what I am doing now, but blocks the auto-play next song functionality.
Drivers: Marking blocking, as we cannot play more than one song in a row without this fixed.
blocking-basecamp: --- → +
blocking-kilimanjaro: --- → +
cpearce, is this something you can work on?  If not, can you suggest someone else?  Thanks.
Assignee: nobody → cpearce
Assignee: cpearce → dhylands
Whiteboard: [WebAPI:P0]
My analysis has found the following. There is a reader thread which reads the mp3 audio source. In my case it was getting almost all of the way through the mp3 stream and then OmxDecoder::ReleaseAudioBuffer is called and never returns. It turns out its blocked waiting on a mutex. Here's a backtrace of that thread:

#0  __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:182
#1  0x400d9514 in _normal_lock (mutex=0x47ca93b0) at bionic/libc/bionic/pthread.c:951
#2  pthread_mutex_lock (mutex=0x47ca93b0) at bionic/libc/bionic/pthread.c:1041
#3  0x4f27a75c in android::Mutex::lock (this=0x47ca9300, buffer=0x80) at frameworks/base/include/utils/threads.h:289
#4  Autolock (this=0x47ca9300, buffer=0x80) at frameworks/base/include/utils/threads.h:244
#5  android::OMXCodec::signalBufferReturned (this=0x47ca9300, buffer=0x80) at frameworks/base/media/libstagefright/OMXCodec.cpp:5357
#6  0x4f27115e in android::MediaBuffer::release (this=0x44140380) at frameworks/base/media/libstagefright/MediaBuffer.cpp:102
#7  0x44593620 in OmxDecoder::ReleaseAudioBuffer (this=0x444efb00) at /home/work/B2G-otoro/gecko-git/media/omx-plugin/OmxPlugin.cpp:438
#8  0x44593678 in OmxDecoder::ReadAudio (this=0x444efb00, aFrame=0x473fcd10, aSeekTimeUs=-1) at /home/work/B2G-otoro/gecko-git/media/omx-plugin/OmxPlugin.cpp:592
#9  0x445937be in ReadAudio (aDecoder=<value optimized out>, aFrame=0x80, aSeekTimeUs=2) at /home/work/B2G-otoro/gecko-git/media/omx-plugin/OmxPlugin.cpp:663
#10 0x40fa204c in nsMediaPluginReader::DecodeAudioData (this=0x467c2700) at /home/work/B2G-otoro/gecko-git/content/media/plugins/nsMediaPluginReader.cpp:257
#11 0x40f8a642 in nsBuiltinDecoderStateMachine::DecodeLoop (this=0x4869d140) at /home/work/B2G-otoro/gecko-git/content/media/nsBuiltinDecoderStateMachine.cpp:883
#12 0x40f8aed4 in nsBuiltinDecoderStateMachine::DecodeThreadRun (this=0x4869d140) at /home/work/B2G-otoro/gecko-git/content/media/nsBuiltinDecoderStateMachine.cpp:491
#13 0x40837cf6 in nsRunnableMethodImpl<void (nsUpdateProcessor::*)(), true>::Run (this=<value optimized out>) at ../../dist/include/nsThreadUtils.h:349
#14 0x4153323a in nsThread::ProcessNextEvent (this=0x43aed2b0, mayWait=<value optimized out>, result=0x473fce8f) at /home/work/B2G-otoro/gecko-git/xpcom/threads/nsThread.cpp:624
#15 0x414fce90 in NS_ProcessNextEvent_P (thread=0x47ca93b0, mayWait=true) at /home/work/B2G-otoro/objdir-gecko-debug-git/xpcom/build/nsThreadUtils.cpp:220
#16 0x41533864 in nsThread::ThreadFunc (arg=<value optimized out>) at /home/work/B2G-otoro/gecko-git/xpcom/threads/nsThread.cpp:257
#17 0x40198644 in _pt_root (arg=<value optimized out>) at /home/work/B2G-otoro/gecko-git/nsprpub/pr/src/pthreads/ptthread.c:156
#18 0x400da0d8 in __thread_entry (func=0x401985dd <_pt_root>, arg=0x43aed390, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#19 0x400d9c2c in pthread_create (thread_out=<value optimized out>, attr=0x4d9ffbe4, start_routine=0x401985dd <_pt_root>, arg=0x43aed390) at bionic/libc/bionic/pthread.c:357
#20 0x00000000 in ?? ()

So I enabled some debug stuff in the android mutex library and determined that the following thread is holding the mutex:

#0  android::MediaStreamSource::readAt (this=0x488723a0, offset=50816, data=0x4f601820, size=1024) at /home/work/B2G-otoro/gecko-git/media/omx-plugin/OmxPlugin.cpp:83
#1  0x4f265d62 in Resync (source=..., match_header=<value optimized out>, inout_pos=0x4f601c78, post_id3_pos=<value optimized out>, out_header=0x0)
    at frameworks/base/media/libstagefright/MP3Extractor.cpp:126
#2  0x4f26635a in android::MP3Source::read (this=0x480bf940, out=0x4f601d68, options=<value optimized out>) at frameworks/base/media/libstagefright/MP3Extractor.cpp:489
#3  0x4f277b84 in android::OMXCodec::drainInputBuffer (this=0x4c3f46a0, info=0x48ed9258) at frameworks/base/media/libstagefright/OMXCodec.cpp:4141
#4  0x4f27b2ea in android::OMXCodec::on_message (this=0x4c3f46a0, msg=<value optimized out>) at frameworks/base/media/libstagefright/OMXCodec.cpp:3073
#5  0x4f27b820 in android::OMXCodecObserver::onMessage (this=<value optimized out>, msg=...) at frameworks/base/media/libstagefright/OMXCodec.cpp:328
#6  0x464a4cb6 in android::OMXNodeInstance::onMessage (this=0x47dfcd00, msg=...) at frameworks/base/media/libstagefright/omx/OMXNodeInstance.cpp:679
#7  0x464a3e3e in android::OMX::CallbackDispatcher::dispatch (this=<value optimized out>, msg=...) at frameworks/base/media/libstagefright/omx/OMX.cpp:125
#8  0x464a3eb6 in android::OMX::CallbackDispatcher::loop (this=0x45841e80) at frameworks/base/media/libstagefright/omx/OMX.cpp:146
#9  0x464a3ec0 in android::OMX::CallbackDispatcherThread::threadLoop (this=<value optimized out>) at frameworks/base/media/libstagefright/omx/OMX.cpp:155
#10 0x4222ff20 in android::Thread::_threadLoop (user=<value optimized out>) at frameworks/base/libs/utils/Threads.cpp:834
#11 0x42230586 in thread_data_t::trampoline (t=<value optimized out>) at frameworks/base/libs/utils/Threads.cpp:127
#12 0x400fd0d8 in __thread_entry (func=0x422304f1 <thread_data_t::trampoline(thread_data_t const*)>, arg=0x47d02730, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#13 0x400fcc2c in pthread_create (thread_out=<value optimized out>, attr=0x464cca2c, start_routine=0x422304f1 <thread_data_t::trampoline(thread_data_t const*)>, arg=0x47d02730)
    at bionic/libc/bionic/pthread.c:357
#14 0x00000000 in ?? ()

The OMXCodec::on_message function is holding the mutex. The readAt function is spinning with a non-zero todo value, but getting zero bytesRead (because the thread which is going to put more bytes in the queue is blocked trying to release a buffer).

I figured I'd let somebody who is more familiar with the code suggest a fix. One possible fix would be to release and reacquire the mutex if readAt gets bytesRead of zero. drainInputBuffer is probably doing too much work while holding the mutex.
The MP3 file I'm using has both ID3v1 and ID3v2 headers. The ID3v1 header is actually stored int the last 128 bytes of the file, and what's happening is that our MediaStreamSource is trying to feed the contents of the ID3v1 header as MP3 encoded data, which is what causes all of this.

So we need someplace to detect that the ID3v1 header is present (it starts with the string "TAG") and reduce the size of the media resource by 128 bytes.
Hah!  Nice debugging work.

Media folks: what's the right way to do this?
This sounds like a bug in libstagefright. We still need to handle it obviously.

Edwin, doublec or cpeterson would be the best people to answer this, Edwin's on PTO today however.
If someone provides an example MP3 file that shows the issue I'll look into it tomorrow.
This is a small (7 second) clip which I was using to reproduce the problem.

It has ID3v1, ID3v2, Xing, and Lame headers in it.
Ok, thanks. Do you know if the same issue occurs on Android?
On my Samsung Galaxy Tab 10.1 running ICS, The "Music Player" app plays it back properly (and continues on the next song right after its finished).

My phone that I normally use for debugging these things (SGS2) is dead and I'm waiting for a replacement.

In more detail, the frameworks/base/media/libstagefright/MP3Extractor.cpp file, in the MP3Source::read function has a log that prints out:

lost sync! header = 0x54414700, old header = 0xfffb90c4

The 54414700 is the letters TAG. 

That particular log message is disabled by default, so you need to rebuild android to enable it.
Whiteboard: [WebAPI:P0] → [WebAPI:P0][LOE:S]
I've implemented a Gaia-based workaround in https://github.com/mozilla-b2g/gaia/pull/5055, so this probably doesn't have to block basecamp
Dhylands, are there any other reasons to block on fixing the underlying platform issue for V1?
My only concern is that when things go wrong you wind up with one thread blocked waiting on a semaphore. As long as that portion cleans itself up then we're fine.

So I guess my concern is that we may still have a race where things might hang.
Thanks, how do we verify that?
If we increase the timeout in the Music player so that it goes off a few seconds after the end of the song (rather than at the end of the song) and verify that the mp3 file has both an IDv1 and IDv2 tags, and everything continues to work for several songs in a row, then I think we'd be good.

We need to test on the otoro.
For what its worth, the timer workaround in the Music app waits until it is within one second of the end of the song, and then sets a timer for one second after the end of the song.  If the timer is set, then it ignores any ended event it does get.  (During this brief window when the timers is set, the Music app's seek controls won't work right, but I've ignored that as an unlikely corner case.)
Priority: -- → P1
Whiteboard: [WebAPI:P0][LOE:S] → [LOE:S]
Whiteboard: [LOE:S] → [LOE:S][qa+]
I don't think that this bug needs to block basecamp anymore since the workaround seems to be working.
blocking-basecamp: + → ?
Not a blocker based on comment #19.
blocking-basecamp: ? → -
I seem to be having the same issue with not getting 'ended' events when trying to play the videos recorded by the camera.  I'm going to be using the same workaround when playing videos in the gallery, but wanted to mention it here in case it is the same underlying bug affecting both the audio and video elements
Note that the workaround cause troubles like bug 833517. Maybe it's time to pick up this bug again on platform to trim the weight of Gaia.
Assignee: dhylands → nobody
I can confirm that we have the same problem in Android 4.2.2 on the Galaxy Nexus with Firefox Beta (v19) for Android.
Attached file Github pull request
Hi,

The patch basically just removes the workaround that has been added earlier. Bug 831224, which fixes the underlying issue, is ready to land as soon as b2g-inbound is open again. Maybe this patch can already be reviewed meanwhile.
Attachment #786993 - Flags: review?(dflanagan)
Assignee: nobody → tzimmermann
Status: NEW → ASSIGNED
Updated the pull request.
ping (for the review)
Comment on attachment 786993 [details]
Github pull request

Thomas,

What an epic bug fix you did in bug 831224. Thank you for following through and circling back to this one.

The code looks fine, but I'm giving r- here because of the discrepancy between your description of what bug 831224 fixes and what Dave Hylands diagnosed here in comment #6.

It doesn't sound like your patch in 831224 fixes the issue that Dave identified in comment 6. Have you tested with the test file that Dave attached to this bug? If it works, do you know what bug # fixed it?

If it doesn't work, then obviously, I don't want to take out the timer code yet.  In that case, I'd love a pull request with just the durationchange event handler.
Attachment #786993 - Flags: review?(dflanagan) → review-
Hi  David,

Quite a while ago, I tested this patch with the attached file, the file from bug 832147, and my own music collection, to which I refer in bug 831224. They all worked correctly and the ended event was received in all cases.

Because you asked for it, I tested again to compare with an older version that doesn't have the patches from bug 831224. However I cannot reproduce the issue with the older version. Maybe something else changed meanwhile that fixed this bug. I'll try to find the revision where the problem disappears.
I'd like to get rid of the workaround, if possible. So I started searching for an old version where this problem was fixed. I applied the patch to Gaia and dumped the received events into the logcat. My testing was done with the attached MP3 file. The oldest version of Gecko that I could still get to work was rev ca457da3604a, committed Apr 03. Even in that version, I was unable to reproduce the original problem. Hopefully this allows us to remove the workaround from the Music player.

Dave, could you verify if this problem still persists or has been fixed? Thanks!
Flags: needinfo?(dhylands)
I looked at frameworks/base/media/libstagefright/MP3Extractor.cpp in the MP3Source::read function.

I changed line 489 to use LOGE instead of LOGV and ran the music app.

I saw the following 2 messages in logcat:

lost sync! header = 0x54414700, old header = 0xfffb90c4
Unable to resync. Signalling end of stream.

So the MP3 extractor is still trying to parse the IDv3 header as music.

If I apply the patch from https://github.com/mozilla-b2g/gaia/pull/11402 then I observed a hang at the end of playing the attached mp3 file (I was using a master build on my unagi).

Without applying the patch, then the attached mp3 file played and the next file after it played as well.
Flags: needinfo?(dhylands)
Thanks for testing again! It's strange that I cannot reproduce the issue. I tested with an unagi. What device do you use? Maybe it's fixed in some implementations of libstagefright.
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #31)
> Thanks for testing again! It's strange that I cannot reproduce the issue. I
> tested with an unagi. What device do you use? Maybe it's fixed in some
> implementations of libstagefright.

I'm also using an unagi.

My gaia tree is at: 0035590eefc1ac6cfc05cfa868ed646f9f545e0c (master)
I'm using b2g-inbound for gecko, and its at: 144322:1fe485c4dc42
This corresponds roughly to 7dc9265ed90037fbd273a05d322637931191d87a in the gecko tree, plus a few landings of other bugs.
Oh yeah - I was also using a B2G_DEBUG=1 build. Maybe that changes the timing of things...
Hmm, not much difference from my setup, including B2G_DEBUG. :/ I'll try to reproduce what you describe in Comment 30.
I can't reproduce the problem. The git commit hash in frameworks/base is eb2bc75803ca179353c24c364a9c8a8ce23e8b78
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #35)
> I can't reproduce the problem. The git commit hash in frameworks/base is
> eb2bc75803ca179353c24c364a9c8a8ce23e8b78

That's what my hash is as well.

Let me try again (it may take a couple of days) and I can send you my unagi image, and my set of files that I have on my sdcard.
Does this bug still happen? I'm wondering if bug 915958 fixed this.
I commented out the creation of endedTimer in gaia/apps/Music/js/Player.js and then tried to reproduce this today (I tried VARIANT=userdebug, B2G_DEBUG=0/1 and VARIANT=eng, B2G_DEBUG=1) and wasn't able to.

I then backed out bug 915958, but still wasn't able to reproduce it, so I can't say conclusively whether bug 915958 fixed it or not.

My gut was that this was a race of some kind, so other things in the system make be impacting this.
Haven't worked on this for ages...
Assignee: tzimmermann → nobody
Firefox OS is not being worked on
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: