This was causing some very large segments to be produced if the
input had some weird characteristics like missing timestamps.
Co-authored-by: Marco van Dijk <marco@stronk.rocks>
* Add a duration check for the input file to avoid transcoding overly long inputs, and to protect against inputs with timestamp anomalies that would cause the output to be much longer than the input
---------
Co-authored-by: Josh Allmann <joshua.allmann@gmail.com>
Fixes a number of issues, including an LPMS crash, choppy video
quality, green screens during rotation, inconsistent frame counts
vs software decoding, etc. We also apparently gained GPU
support for MPEG2 decoding.
This is a massive change: we can no longer add outputs up front
due to the ffmpeg hwaccel API, so we have to wait until we receive
a decoded video frame in order to add outputs. This also means
properly queuing up audio and draining things in the same order.
This adds demuxer options as a complement to the existing encoder/muxer
options which allows us to:
1. explicitly select the demuxer to use if probing doesn't return a good result
2. configure the demuxer with additional options
This need has come up a few times while investigating various issues, so it is
good to have an API that is fully configurable out of the box.
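As a rough sketch of the shape of such options (the type and field names here are hypothetical, not the actual LPMS API), demuxer configuration amounts to an explicit demuxer name plus a key/value map that is ultimately forwarded to avformat_open_input:

```go
package main

import "fmt"

// DemuxerOpts is a hypothetical illustration of a demuxer configuration:
// an explicit demuxer name (for when probing doesn't return a good result)
// plus free-form options forwarded to the demuxer.
type DemuxerOpts struct {
	Name string            // e.g. "mpegts"; empty means rely on probing
	Opts map[string]string // demuxer-specific options
}

func main() {
	in := DemuxerOpts{
		Name: "mpegts",
		// scan_all_pmts is a real mpegts demuxer option, shown as an example.
		Opts: map[string]string{"scan_all_pmts": "1"},
	}
	fmt.Println(in.Name, in.Opts["scan_all_pmts"])
}
```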
This allows the transcoded resolution to be re-clamped
correctly if the input resolution changes mid-segment.
As a result, we no longer need to do this clamping in golang.
Additionally, make the behavior between GPU and CPU more consistent
by applying nvidia codec limits and clamping CPU transcodes.
Also adds a bunch of other changes necessary to better support
mid-stream resolution changes.
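The clamping itself can be sketched roughly as follows (the 4096x4096 limit is illustrative; the real NVIDIA limits depend on codec and GPU generation):

```go
package main

import (
	"fmt"
	"math"
)

// clampRes scales (w, h) down to fit within (maxW, maxH), preserving
// aspect ratio and forcing even dimensions as most encoders require.
func clampRes(w, h, maxW, maxH int) (int, int) {
	if w <= maxW && h <= maxH {
		return w, h
	}
	// Pick the tighter scale factor so both dimensions fit.
	scale := math.Min(float64(maxW)/float64(w), float64(maxH)/float64(h))
	// &^ 1 clears the low bit, rounding down to an even value.
	return int(float64(w)*scale) &^ 1, int(float64(h)*scale) &^ 1
}

func main() {
	fmt.Println(clampRes(8192, 4320, 4096, 4096)) // 4096 2160
	fmt.Println(clampRes(1920, 1080, 4096, 4096)) // 1920 1080
}
```

Applying the same function on both the GPU and CPU paths is what keeps the two behaviors consistent.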
Unfortunately with CUVID there still seems to be a brief flash of
green (looks to be the length of the decoder's internal frame buffer)
but we can tackle that separately. This PR simply makes the transcoder
1. not crash, and
2. correctly encode mid-stream rotations, including with CPUs
This usually happens with CUVID if the decoder needs to be reset
internally for whatever reason, such as a mid-stream resolution
change.
Also block demuxing until the decoder is ready to receive packets again.
Also add another condition for re-initialization: if the
input resolution changes. This triggers the filter graph
to re-build and adjust to the new resolution, when CPU
encoders are in use.
This mostly ensures that non-B frames have the same dts/pts.
The PTS/DTS from the encoder can be "squashed" a bit during rescaling
back to the source timebase if it is used directly, due to the lower
resolution of the encoder timebase. We avoid this problem for the
PTS in FPS passthrough mode by reusing the source pts, but only
rescale the encoder-provided DTS back to the source timebase for some
semblance of timestamp consistency. Because the DTS values are
squashed, they can differ from the PTS even with non-B frames.
The DTS values are still monotonic, so the exact numbers are not really
important. However, some tools use `dts == pts` as a heuristic to check
for B-frames ... so help them out to avoid spurious B-frame detections.
To fix the DTS/PTS mismatch, take the difference between the
encoder-provided dts/pts, rescale that difference back to the source
time base, and re-calculate the dts using the source pts.
Also see https://github.com/livepeer/lpms/pull/405
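The fix can be illustrated with simplified rescaling arithmetic (the `rescale` helper is a stand-in for av_rescale_q; the 90 kHz/30 fps timebases are examples, not the only possible values):

```go
package main

import "fmt"

// Rational mirrors ffmpeg's AVRational.
type Rational struct{ Num, Den int64 }

// rescale converts ts from timebase bq to cq, rounding to nearest
// (a simplified stand-in for av_rescale_q).
func rescale(ts int64, bq, cq Rational) int64 {
	num := ts * bq.Num * cq.Den
	den := bq.Den * cq.Num
	if num >= 0 {
		return (num + den/2) / den
	}
	return (num - den/2) / den
}

func main() {
	src := Rational{1, 90000} // mpegts 90 kHz source timebase
	enc := Rational{1, 30}    // encoder timebase from a 30 fps framerate

	srcPTS := int64(3003) // one 29.97 fps frame interval in 90 kHz ticks
	// Encoder-side timestamps; for a non-B frame, dts == pts.
	encPTS := rescale(srcPTS, src, enc) // "squashed" down to 1
	encDTS := encPTS

	// Naive: rescale the encoder DTS straight back to the source timebase.
	naiveDTS := rescale(encDTS, enc, src)
	fmt.Println(naiveDTS) // 3000: no longer equal to srcPTS (3003)

	// Fix: rescale only the dts-pts *difference*, then anchor on the source pts.
	diff := rescale(encDTS-encPTS, enc, src)
	fixedDTS := srcPTS + diff
	fmt.Println(fixedDTS) // 3003: dts == pts again for non-B frames
}
```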
This commit ensures that developers are aware of the extra build flags
they need when installing ffmpeg for the tests to be successful.
It also cleans up the README a bit.
* Port install_ffmpeg.sh from go-livepeer
* Update ffmpeg and nv-codec-headers versions.
* Use local install_ffmpeg.sh in github CI
* Update transcoder for ffmpeg 7.0.1
* Update tests to be compatible with ffmpeg7 binary
* Fix FPS passthrough
* Set the encoder timebase using AVCodecContext.framerate instead of
the decoder's AVCodecContext.time_base.
The use of AVCodecContext.time_base is deprecated for decoding.
See https://ffmpeg.org/doxygen/3.3/structAVCodecContext.html#ab7bfeb9fa5840aac090e2b0bd0ef7589
* Adjust the packet timebase as necessary for FPS passthrough
to match the encoder's expected timebase. For filtergraphs using
FPS adjustment, the filtergraph output timebase will match the
framerate (1 / framerate) and the encoder is configured for the same.
However, for FPS passthrough, the filtergraph's output timebase
will match the input timebase (since there is no FPS adjustment)
while the encoder uses the timebase detected from the decoder's
framerate. Since the input timebase does not typically match the FPS
(eg 90khz for mpegts vs 30fps), we need to adjust the packet timestamps
(in container timebase) to the encoder's expected timebase.
* For the specific case of FPS passthrough, preserve the original PTS
as much as possible since we are trying to re-encode existing frames
one-to-one. Use the opaque field for this, since it is already being
populated with the original PTS to detect sentinel packets
during flushing.
Without this, timestamps can be slightly "squashed" down when
rescaling output packets to the muxer's timebase, due to the loss of
precision (eg, demuxer 90khz -> encoder 30hz -> muxer 90khz)
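The precision loss in that round trip can be shown with simplified rescaling arithmetic (the `rescale` helper is a stand-in for av_rescale_q, using the 90 kHz / 30 Hz timebases from the example above):

```go
package main

import "fmt"

// rescale converts ts between integer tick rates given in Hz,
// rounding to nearest (a simplified stand-in for av_rescale_q).
func rescale(ts, fromHz, toHz int64) int64 {
	return (ts*toHz + fromHz/2) / fromHz
}

func main() {
	// demuxer 90 kHz -> encoder 30 Hz -> muxer 90 kHz
	src := int64(3003)              // one 29.97 fps frame interval at 90 kHz
	enc := rescale(src, 90000, 30)  // squashed to 1 in the coarse timebase
	back := rescale(enc, 30, 90000) // 3000, not 3003: precision is lost
	fmt.Println(enc, back)
	// Carrying the original pts (3003) through the opaque field sidesteps
	// the lossy round trip entirely.
}
```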
* Improve VFR support.
Manually calculate the duration of each frame and set
the PTS to that before submitting to the filtergraph.
This allows us to better support variable frame rates,
and is also better aligned with how ffmpeg does it.
This may change the number of frames output by the FPS
filter by +/- 1 frame. This isn't an issue in itself,
but it breaks a lot of test cases, which will need to be updated.
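The per-frame duration calculation can be sketched as follows (a simplification; the function shape and fallback handling are illustrative, not the exact implementation):

```go
package main

import "fmt"

// durations derives each frame's duration from the PTS delta to the next
// frame, reusing the previous duration for the final frame (similar in
// spirit to how ffmpeg's fps filter handles variable frame rates).
func durations(pts []int64, fallback int64) []int64 {
	out := make([]int64, len(pts))
	for i := range pts {
		switch {
		case i+1 < len(pts):
			out[i] = pts[i+1] - pts[i] // duration = gap to the next frame
		case i > 0:
			out[i] = out[i-1] // last frame: reuse the previous duration
		default:
			out[i] = fallback // single-frame input: nominal duration
		}
	}
	return out
}

func main() {
	// VFR input at 90 kHz: frame intervals of 3000, 3600, then 3000 ticks.
	fmt.Println(durations([]int64{0, 3000, 6600, 9600}, 3000))
	// [3000 3600 3000 3000]
}
```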
* Update test cases for VFR.