Ruiling Song 
							
						 
					 
					
						
						
							
						
						98e419cbf5 
					 
					
						
						
							
							avfilter/vf_convolution: add x86 SIMD for filter_3x3()  
						
						... 
						
						
						
						Tested using a simple command (apply edge enhance):
./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
 -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
 -an -vframes 1000 -f null /dev/null
The fps increases from 151 to 270 on my local machine.
Signed-off-by: Ruiling Song <ruiling.song@intel.com > 
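
For orientation, the test command above can be read against a scalar 3x3 convolution. The following is only a minimal sketch, not the code from this commit: the names `clip_u8` and `conv3x3_pixel` are hypothetical, and it assumes the option string means four 3x3 matrices followed by per-plane `rdiv` values (5, 1, 1, 1) and per-plane `bias` values (0, 128, 128, 128).

```c
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar reference: apply a 3x3 kernel to the interior
 * pixel (x, y) of an 8-bit plane, then scale by rdiv and add bias.
 * The caller must ensure (x, y) is not on the plane border. */
static uint8_t conv3x3_pixel(const uint8_t *src, int stride, int x, int y,
                             const int kernel[9], float rdiv, float bias)
{
    int sum = 0;

    for (int ky = -1; ky <= 1; ky++)
        for (int kx = -1; kx <= 1; kx++)
            sum += kernel[(ky + 1) * 3 + (kx + 1)] *
                   src[(y + ky) * stride + (x + kx)];

    /* Round to nearest for non-negative results; negative results clamp to 0. */
    return clip_u8((int)(sum * rdiv + bias + 0.5f));
}
```

With the kernel `0 0 0 -1 1 0 0 0 0`, each output pixel is the difference between the centre pixel and its left neighbour, which is the kind of per-pixel multiply-accumulate a SIMD version would process several pixels at a time.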
						
						
					 
					
						2019-08-07 14:31:28 +08:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						b8f1542dcb 
					 
					
						
						
							
							avfilter/vf_gblur: add missing preprocessor check  
						
						... 
						
						
						
						Fixes compilation on x86_32
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2019-06-12 10:54:59 -03:00 
						 
				 
			
				
					
						
							
							
								Ruiling Song 
							
						 
					 
					
						
						
							
						
						83f9da7768 
					 
					
						
						
							
							avfilter/vf_gblur: add x86 SIMD optimizations  
						
						... 
						
						
						
The horizontal pass gets ~2x performance with the patch
under a single thread.
Tested overall performance using the commands (AVX2 enabled):
./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
Signed-off-by: Ruiling Song <ruiling.song@intel.com > 
						
						
					 
					
						2019-06-12 08:53:11 +08:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						dcae5ba322 
					 
					
						
						
							
							avfilter: add anlmdn filter x86 SIMD optimizations  
						
						
						
						
					 
					
						2019-01-10 21:49:47 +01:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						ef67af31ff 
					 
					
						
						
							
							x86/af_afir: use three operand form for some instructions  
						
						... 
						
						
						
						Fixes compilation with old yasm versions.
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2019-01-03 23:36:19 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						5402c1886b 
					 
					
						
						
							
							x86/af_afir: add ff_fcmul_add_avx()  
						
						... 
						
						
						
						fcmul_add_c: 1228.8
fcmul_add_sse3: 334.3
fcmul_add_avx: 186.3
Tested on a Core i5 4460 @ 3.2GHz
Reviewed-by: Paul B Mahol <onemda@gmail.com >
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2019-01-03 10:12:19 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						82043dfd2e 
					 
					
						
						
							
							avfilter/af_afir: split off fcmul_add into a DSP context  
						
						... 
						
						
						
						Reviewed-by: Paul B Mahol <onemda@gmail.com >
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2019-01-03 10:12:18 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						9b5bd665e1 
					 
					
						
						
							
							x86/af_afir: fix processing the last element  
						
						... 
						
						
						
						ff_fcmul_add_sse3() is now identical to the C version.
Reviewed-by: Paul B Mahol <onemda@gmail.com >
Signed-off-by: James Almer <jamrial@gmail.com > 
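
As a rough illustration of what a complex multiply-add of this kind computes, here is a scalar sketch. It is not the code from this commit: the name `fcmul_add_ref` is hypothetical, and the layout (interleaved re/im pairs followed by one trailing real-only element) is an assumption made to show why the last element needs separate handling.

```c
#include <stddef.h>

/* Hypothetical scalar reference: sum += t * c for `len` interleaved
 * complex floats (re, im, re, im, ...), plus one trailing real-only
 * element at index 2 * len (an assumed layout). */
static void fcmul_add_ref(float *sum, const float *t, const float *c,
                          ptrdiff_t len)
{
    ptrdiff_t n;

    for (n = 0; n < len; n++) {
        const float cre = c[2 * n];
        const float cim = c[2 * n + 1];
        const float tre = t[2 * n];
        const float tim = t[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim; /* real part */
        sum[2 * n + 1] += tre * cim + tim * cre; /* imaginary part */
    }

    /* The trailing element has no imaginary partner. */
    sum[2 * n] += t[2 * n] * c[2 * n];
}
```

A vectorised version has to produce the same result for that final element, which a loop that only steps over full register widths can easily miss.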
						
						
					 
					
						2019-01-03 10:12:18 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						3913d6f734 
					 
					
						
						
							
							x86/scene_sad: fix link errors when HAVE_X86ASM is not defined  
						
						... 
						
						
						
						Reviewed-by: Haihao Xiang <haihao.xiang@intel.com >
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2018-11-21 22:26:07 -03:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						c98a32e4ad 
					 
					
						
						
							
							avfilter/vf_blend: add 10bit support  
						
						
						
						
					 
					
						2018-11-15 14:44:24 +01:00 
						 
				 
			
				
					
						
							
							
								Philip Langdale 
							
						 
					 
					
						
						
							
						
						1096614c42 
					 
					
						
						
							
							avfilter/vf_bwdif: Use common yadif frame management logic  
						
						... 
						
						
						
						After adding field type management to the common yadif logic, we can
remove the duplicate copy of that logic from bwdif. 
						
						
					 
					
						2018-11-14 17:41:01 -08:00 
						 
				 
			
				
					
						
							
							
								Marton Balint 
							
						 
					 
					
						
						
							
						
						6c2a7a8e9a 
					 
					
						
						
							
							avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame  
						
						... 
						
						
						
Also add SIMD which works on lines because it is faster than calculating it on
8x8 blocks using pixelutils.
Signed-off-by: Marton Balint <cus@passwd.hu > 
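
The line-based approach can be pictured with a scalar sketch like the one below. It is an illustration only: the function name `frame_sad_u8` and its signature are hypothetical and are not taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences over a
 * whole 8-bit plane, walking it one line at a time. Each plane may
 * have its own stride (bytes per row, including padding). */
static uint64_t frame_sad_u8(const uint8_t *a, ptrdiff_t a_stride,
                             const uint8_t *b, ptrdiff_t b_stride,
                             int width, int height)
{
    uint64_t sad = 0;

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            sad += (uint64_t)abs(a[x] - b[x]);
        a += a_stride;
        b += b_stride;
    }
    return sad;
}
```

Working on full lines means the inner loop runs over long contiguous spans, which suits wide SIMD loads better than many small 8x8 block calls.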
						
						
					 
					
						2018-11-11 20:30:50 +01:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						0f0d468fbc 
					 
					
						
						
							
							avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check  
						
						... 
						
						
						
They are not yet supported.
Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2018-05-03 09:22:28 +02:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						6d7c63588c 
					 
					
						
						
							
							avfilter/vf_overlay: add x86 SIMD  
						
						... 
						
						
						
Specifically for the yuv444, yuv422 and yuv420 formats when the main stream has no alpha
and the alpha is straight.
Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2018-05-02 23:58:21 +02:00 
						 
				 
			
				
					
						
							
							
								Vasile Toncu 
							
						 
					 
					
						
						
							
						
						9c01cdb94e 
					 
					
						
						
							
							avfilter/vf_interlace: remove duplicate code with same functionality  
						
						
						
						
					 
					
						2018-04-23 23:48:30 +02:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						f3df42e81d 
					 
					
						
						
							
							avfilter/x86/vf_blend : add SIMD for 16 bit version of  
						
						... 
						
						
						
						grainextract
grainmerge
average
extremity
negation 
						
						
					 
					
						2018-04-05 21:46:16 +02:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						8eb0bb1108 
					 
					
						
						
							
							avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version  
						
						
						
						
					 
					
						2018-04-05 21:46:11 +02:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						53a03b5c8c 
					 
					
						
						
							
							avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64)  
						
						
						
						
					 
					
						2018-02-24 21:44:19 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						6c6c9d14a8 
					 
					
						
						
							
							avfilter/x86/vf_blend : indent  
						
						
						
						
					 
					
						2018-02-24 21:44:16 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						7590d58b61 
					 
					
						
						
							
							avfilter/x86/vf_blend : reorganize init in order to add 16 bit version  
						
						
						
						
					 
					
						2018-02-24 21:44:13 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						3a230ce5fa 
					 
					
						
						
							
							avfilter/x86/vf_blend : add AVX2 version for each func except divide  
						
						... 
						
						
						
						and optimize average, grainextract, multiply, screen, grain merge 
						
						
					 
					
						2018-01-28 20:21:32 +01:00 
						 
				 
			
				
					
						
							
							
								Marton Balint 
							
						 
					 
					
						
						
							
						
						4d95c6d5d7 
					 
					
						
						
							
							avfilter/vf_framerate: add SIMD functions for frame blending  
						
						... 
						
						
						
						Blend function speedups on x86_64 Core i5 4460:
ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
C:     447548411 decicycles in Blend,    2048 runs,      0 skips
SSSE3: 130020087 decicycles in Blend,    2048 runs,      0 skips
AVX2:  128508221 decicycles in Blend,    2048 runs,      0 skips
ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
C:     228932745 decicycles in Blend,    2048 runs,      0 skips
SSE4:  123357781 decicycles in Blend,    2048 runs,      0 skips
AVX2:  121215353 decicycles in Blend,    2048 runs,      0 skips
Signed-off-by: Marton Balint <cus@passwd.hu > 
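
For context on what the timed "Blend" step does, a weighted blend of two lines can be sketched in scalar C as follows. This is an assumption-laden illustration, not the filter's code: `blend_line_u8` is a hypothetical name, and the fixed-point layout (two integer weights that add up to `1 << shift`) is assumed.

```c
#include <stdint.h>

/* Hypothetical scalar reference: blend two 8-bit lines with integer
 * weights. The caller is assumed to pass factor1 + factor2 == 1 << shift,
 * so the result always fits in 8 bits. */
static void blend_line_u8(uint8_t *dst, const uint8_t *src1,
                          const uint8_t *src2, int width,
                          int factor1, int factor2, int shift)
{
    const int half = 1 << (shift - 1); /* rounding term */

    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((src1[x] * factor1 + src2[x] * factor2 + half)
                           >> shift);
}
```

Each pixel needs two multiplies, two adds and a shift with no dependency on its neighbours, which is why this loop maps well onto SSSE3/SSE4/AVX2 lanes.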
						
						
					 
					
						2018-01-28 18:50:52 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						b94cd55155 
					 
					
						
						
							
							avfilter/x86/vf_interlace : add AVX2 version  
						
						
						
						
					 
					
						2018-01-11 21:03:19 +01:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						8e0e4384b0 
					 
					
						
						
							
							Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16"  
						
						... 
						
						
						
						This reverts commits 1a5865b6dc and 8fb1d63d91.
Signed-off-by: James Almer <jamrial@gmail.com> 
						
						
					 
					
						2017-12-19 19:04:25 -03:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						3df6e61dad 
					 
					
						
						
							
							avfilter/x86/vf_hflip : indent  
						
						... 
						
						
						
						based on patch by Paul B Mahol 
						
						
					 
					
						2017-12-19 21:10:12 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						f181648176 
					 
					
						
						
							
							avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short  
						
						
						
						
					 
					
						2017-12-19 21:10:09 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						a4a4179e83 
					 
					
						
						
							
							avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro  
						
						
						
						
					 
					
						2017-12-19 21:10:05 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						8fb1d63d91 
					 
					
						
						
							
							avfilter/vf_tinterlace : add AVX2 func for lowpass_line 8 and 16  
						
						
						
						
					 
					
						2017-12-19 20:59:59 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						1a5865b6dc 
					 
					
						
						
							
							avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16  
						
						
						
						
					 
					
						2017-12-19 20:59:54 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						d31770d9a6 
					 
					
						
						
							
							avfilter/vf_interlace : move func init in ff_interlace_init and add depth arg for ff_interlace_init_x86  
						
						
						
						
					 
					
						2017-12-19 20:59:47 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						3c6dc27035 
					 
					
						
						
							
							avfilter/x86/vf_interlace : fix crash when using unaligned data in low_pass complex  
						
						... 
						
						
						
						related to ticket 6491 
						
						
					 
					
						2017-12-15 11:28:29 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						49dced9fd0 
					 
					
						
						
							
							avfilter/x86/vf_interlace : avoid crash when data are unaligned  
						
						... 
						
						
						
						ticket 6491 
						
						
					 
					
						2017-12-15 11:28:25 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						869efbf971 
					 
					
						
						
							
							avfilter/x86/vf_threshold : add threshold16 SIMD (SSE4 and AVX2)  
						
						
						
						
					 
					
						2017-12-09 14:47:09 +01:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						f2aa0ce5a0 
					 
					
						
						
							
							x86/vf_hflip: use xor to zero initialize registers  
						
						... 
						
						
						
						Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-12-07 19:34:12 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						dc33fe1d00 
					 
					
						
						
							
							x86/vf_hflip: don't load the width argument twice  
						
						... 
						
						
						
						Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-12-07 19:34:12 -03:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						cc2ba526d4 
					 
					
						
						
							
							x86/vf_threshold: make threshold8 functions work on x86_32  
						
						... 
						
						
						
						Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-12-04 15:46:09 -03:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						5ff0d2acae 
					 
					
						
						
							
							avfilter/x86/vf_hflip.asm: fix building on x32  
						
						... 
						
						
						
						Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2017-12-04 15:08:43 +01:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						86fda8be3f 
					 
					
						
						
							
							avfilter: add hflip x86 SIMD  
						
						... 
						
						
						
						Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2017-12-04 09:58:25 +01:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						b73304f79e 
					 
					
						
						
							
							x86/vf_threshold: use the PBLENDVB macro  
						
						... 
						
						
						
						Fixes building with yasm
Tested-by: stevenliu
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-12-04 02:22:30 -03:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						6e3e696591 
					 
					
						
						
							
							avfilter/x86/vf_threshold : cosmetic indent  
						
						
						
						
					 
					
						2017-12-03 19:17:28 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						9719d57b34 
					 
					
						
						
							
							avfilter/x86/vf_threshold : add avx2 version for threshold 8  
						
						
						
						
					 
					
						2017-12-03 19:17:23 +01:00 
						 
				 
			
				
					
						
							
							
								Martin Vignali 
							
						 
					 
					
						
						
							
						
						51345cb1d5 
					 
					
						
						
							
							avfilter/x86/vf_threshold : make macro for threshold8 in order to add avx2 version  
						
						
						
						
					 
					
						2017-12-03 19:17:19 +01:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						bbfcb1b7c8 
					 
					
						
						
							
							avfilter/vf_threshold: add x86 SIMD  
						
						... 
						
						
						
						Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2017-12-02 14:58:56 +01:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						2904db9045 
					 
					
						
						
							
							Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2'  
						
						... 
						
						
						
						* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
  x86util: Port all macros to cpuflags
See d5f8a642f6
Merged-by: James Almer <jamrial@gmail.com> 
						
						
					 
					
						2017-10-21 12:15:57 -03:00 
						 
				 
			
				
					
						
							
							
								Thomas Mundt 
							
						 
					 
					
						
						
							
						
						40bfaa190c 
					 
					
						
						
							
							avfilter/interlace: add support for 10 and 12 bit  
						
						... 
						
						
						
						Reviewed-by: Michael Niedermayer <michael@niedermayer.cc >
Signed-off-by: Thomas Mundt <tmundt75@gmail.com >
Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-09-23 16:19:58 -03:00 
						 
				 
			
				
					
						
							
							
								Thomas Mundt 
							
						 
					 
					
						
						
							
						
						a7f6bfdc18 
					 
					
						
						
							
							avfilter/interlace: prevent over-sharpening with the complex low-pass filter  
						
						... 
						
						
						
						The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error accumulates, e.g. over some generations of HD->SD SD->HD.
To prevent this behaviour, the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel, and it must not fall below the source pixel in the opposite case.
Tested and approved in a visual transcoding cascade test by video professionals.
SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation (3 x SD->HD HD->SD):
Results without the patch:
SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
Results with the patch:
SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
Signed-off-by: Thomas Mundt <tmundt75@gmail.com >
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc > 
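
The clamping rule described above can be sketched in scalar C. This is a minimal illustration under stated assumptions, not the patch itself: the name `lowpass_complex_ref` is hypothetical, and the five-tap weights (-1, 2, 6, 2, -1)/8 are assumed for the complex vertical low-pass.

```c
#include <stdint.h>

/* Hypothetical scalar reference for one output line. The five input
 * lines are the current line, the lines directly above/below, and the
 * lines two rows above/below. */
static void lowpass_complex_ref(uint8_t *dst,
                                const uint8_t *above2, const uint8_t *above,
                                const uint8_t *cur,
                                const uint8_t *below, const uint8_t *below2,
                                int width)
{
    for (int i = 0; i < width; i++) {
        const int src_le = above[i] + below[i];   /* 2 * neighbour average */
        const int src_x  = cur[i] * 2;            /* 2 * source pixel */
        const int src_ab = above2[i] + below2[i];
        /* Assumed taps (-1, 2, 6, 2, -1) / 8, with +4 for rounding. */
        const int acc = 4 + ((cur[i] + src_x + src_le) * 2) - src_ab;
        int v = acc < 0 ? 0 : acc >> 3;

        if (v > 255)
            v = 255;

        /* Prevent over-sharpening: keep dst on the neighbours' side of src. */
        if (src_le > src_x) {
            if (v < cur[i])
                v = cur[i];
        } else if (v > cur[i]) {
            v = cur[i];
        }
        dst[i] = (uint8_t)v;
    }
}
```

For example, with a source pixel of 100, neighbours of 90 above and below, and 0 two rows away, the unclamped filter would give 120; the clamp limits the output to 100 because the neighbour average is below the source.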
						
						
					 
					
						2017-09-15 22:40:21 +02:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						f8d0689d3f 
					 
					
						
						
							
							avfilter/vf_blend: rename addition128 and difference128 to grainmerge and grainextract  
						
						
						
						
					 
					
						2017-08-24 14:45:52 +02:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						5688fd77b5 
					 
					
						
						
							
							x86/vf_limiter: make limiter functions work on x86_32  
						
						... 
						
						
						
						Signed-off-by: James Almer <jamrial@gmail.com > 
						
						
					 
					
						2017-07-13 18:17:17 -03:00 
						 
				 
			
				
					
						
							
							
								Paul B Mahol 
							
						 
					 
					
						
						
							
						
						01e545d046 
					 
					
						
						
							
							avfilter: add limiter filter  
						
						... 
						
						
						
						Signed-off-by: Paul B Mahol <onemda@gmail.com > 
						
						
					 
					
						2017-07-08 11:49:54 +02:00 
						 
				 
			
				
					
						
							
							
								James Almer 
							
						 
					 
					
						
						
							
						
						d2ef9e6e7f 
					 
					
						
						
							
							x86/vf_blend: use ABS2 macro  
						
						
						
						
					 
					
						2017-06-27 20:45:55 -03:00