faceswap extract - Ofer Nave

# AI : Faceswap : CLI : faceswap extract

```
python faceswap.py extract -i INPUT -o OUTPUT [OPTIONS]
```

## Default Behavior

Given a video:

```
$ python faceswap.py extract --input-dir ~/foo/foo.mp4 --output-dir ~/foo/__all__ -N 12

foo.mp4                  # 30s @ 23.98fps = 720f
foo_alignments.fsa       # 62 faces across 60 frames
    foo_000001.mp4       # 60 frames (every 12th from 1 to 720)
    ...
    foo_000709.mp4
__all__/
    foo_000001_0.png     # 62 face PNGs (512x512)
    ...
    foo_000709_0.png
```

Given a folder containing images and videos:

```
$ python faceswap.py extract --input-dir ~/bar/bar --output-dir ~/bar/__all__

bar/photo1.jpg
bar/photo2.jpeg
bar/video.mp4
bar/alignments.fsa       # 2 images, 5 faces (video ignored)
    photo1.jpg
    photo2.jpeg
bar/__all__/
    photo1_0.png         # 5 face PNGs (512x512)
    ...
    photo2_2.png
```

## Options

```
-i|--input-dir DIR
    Input directory or video. Either a directory containing the image files you wish
    to process or path to a video file. NB: This should be the source video/frames
    NOT the source faces.

-o|--output-dir DIR
    Output directory. This is where the extracted faces will be saved.

-p|--alignments PATH
    Optional path to an alignments file. Leave blank if the alignments file is at
    the default location.

-b|--batch-mode
    If selected then the input_dir should be a parent folder containing multiple
    videos and/or folders of images you wish to extract from. The faces will be
    output to separate sub-folders in the output_dir.

-D|--detector {cv2-dnn,external,mtcnn,s3fd}
    Detector to use. Some of these have configurable settings in
    '/config/extract.ini' or 'Settings > Configure Extract Plugins':
    - cv2-dnn : A CPU only extractor which is the least reliable and least resource
      intensive. Use this if not using a GPU and time is important.
    - mtcnn : Good detector. Fast on CPU, faster on GPU. Uses fewer resources than
      other GPU detectors but can often return more false positives.
    - s3fd : Best detector. Slow on CPU, faster on GPU.
      Can detect more faces and fewer false positives than other GPU detectors,
      but is a lot more resource intensive.
    - external : Import a face detection bounding box from a json file.
      (configurable in Detect settings)

-A|--aligner {cv2-dnn,external,fan}
    Aligner to use.
    - cv2-dnn : A CPU only landmark detector. Faster, less resource intensive, but
      less accurate. Only use this if not using a GPU and time is important.
    - fan : Best aligner. Fast on GPU, slow on CPU.
    - external : Import 68 point 2D landmarks or an aligned bounding box from a
      json file. (configurable in Align settings)

-M|--masker {bisenet-fp,custom,unet-dfl,vgg-clear,vgg-obstructed} [{bisenet-fp,custom,unet-dfl,vgg-clear,vgg-obstructed} ...]
    Additional Masker(s) to use. The masks generated here will all take up GPU RAM.
    You can select none, one or multiple masks, but the extraction may take longer
    the more you select. NB: The Extended and Components (landmark based) masks are
    automatically generated on extraction.
    - bisenet-fp : Relatively lightweight NN based mask that provides more refined
      control over the area to be masked including full head masking (configurable
      in mask settings).
    - custom : A dummy mask that fills the mask area with all 1s or 0s
      (configurable in settings). This is only required if you intend to manually
      edit the custom masks yourself in the manual tool. This mask does not use the
      GPU so will not use any additional VRAM.
    - vgg-clear : Mask designed to provide smart segmentation of mostly frontal
      faces clear of obstructions. Profile faces and obstructions may result in
      sub-par performance.
    - vgg-obstructed : Mask designed to provide smart segmentation of mostly
      frontal faces. The mask model has been specifically trained to recognize some
      facial obstructions (hands and eyeglasses). Profile faces may result in
      sub-par performance.
    - unet-dfl : Mask designed to provide smart segmentation of mostly frontal
      faces.
      The mask model has been trained by community members and will need testing
      for further description. Profile faces may result in sub-par performance.
    The auto generated masks are as follows:
    - components : Mask designed to provide facial segmentation based on the
      positioning of landmark locations. A convex hull is constructed around the
      exterior of the landmarks to create a mask.
    - extended : Mask designed to provide facial segmentation based on the
      positioning of landmark locations. A convex hull is constructed around the
      exterior of the landmarks and the mask is extended upwards onto the forehead.
    (eg: `-M unet-dfl vgg-clear`, `--masker vgg-obstructed`)

-O|--normalization {none,clahe,hist,mean}
    Performing normalization can help the aligner better align faces with difficult
    lighting conditions at an extraction speed cost. Different methods will yield
    different results on different sets. NB: This does not impact the output face,
    just the input to the aligner.
    - none : Don't perform normalization on the face.
    - clahe : Perform Contrast Limited Adaptive Histogram Equalization on the face.
    - hist : Equalize the histograms on the RGB channels.
    - mean : Normalize the face colors to the mean.

-R|--re-feed N
    The number of times to re-feed the detected face into the aligner. Each time
    the face is re-fed into the aligner the bounding box is adjusted by a small
    amount. The final landmarks are then averaged from each iteration. Helps to
    remove 'micro-jitter' but at the cost of slower extraction speed. The more
    times the face is re-fed into the aligner, the less micro-jitter should occur
    but the longer extraction will take.

-a|--re-align
    Re-feed the initially found aligned face through the aligner. Slows down
    extraction. Can help produce better alignments for faces that are rotated
    beyond 45 degrees in the frame or are at extreme angles.

-r|--rotate-images IMGS
    If a face isn't found, rotate the images to try to find a face. Can find more
    faces at the cost of extraction speed.
    Pass in a single number to use increments of that size up to 360, or pass in a
    list of numbers to enumerate exactly what angles to check.

-I|--identity
    Obtain and store face identity encodings from VGGFace2. Slows down extract a
    little, but will save time if using 'sort by face'.

-m|--min-size N
    Filters out faces detected below this size. Length, in pixels across the
    diagonal of the bounding box. Set to 0 for off.

-n|--nfilter NFILTER1 [...]
    Optionally filter out people who you do not wish to extract by passing in
    images of those people. Should be a small variety of images at different angles
    and in different conditions. A folder containing the required images or
    multiple image files, space separated, can be selected.

-f|--filter FILTER1 [...]
    Optionally select people you wish to extract by passing in images of that
    person. Should be a small variety of images at different angles and in
    different conditions. A folder containing the required images or multiple image
    files, space separated, can be selected.

-l|--ref_threshold VAL
    For use with the optional nfilter/filter files. Threshold for positive face
    recognition. Higher values are stricter.

-z|--size N
    The output size of extracted faces. Make sure that the model you intend to
    train supports your required size. This will only need to be changed for
    hi-res models.

-N|--extract-every-n N
    Extract every 'nth' frame. This option will skip frames when extracting faces.
    For example a value of 1 will extract faces from every frame, a value of 10
    will extract faces from every 10th frame.

-v|--save-interval N
    Automatically save the alignments file after a set number of frames. Set to 0
    to turn off. By default the alignments file is only saved at the end of the
    extraction process. NB: If extracting in 2 passes then the alignments file will
    only start to be saved out during the second pass. WARNING: Don't interrupt the
    script when writing the file because it might get corrupted.

-B|--debug-landmarks
    Draw landmarks on the output faces for debugging purposes.

-P|--singleprocess
    Don't run extraction in parallel. Will run each part of the extraction process
    separately (one after the other) rather than all at the same time. Useful if
    VRAM is at a premium.

-s|--skip-existing
    Skips frames that have already been extracted and exist in the alignments file.

-e|--skip-existing-faces
    Skip frames that already have detected faces in the alignments file.

-K|--skip-saving-faces
    Skip saving the detected faces to disk. Just create an alignments file.
```
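
To make the arithmetic behind `-N` and `-m` concrete, here is a small Python sketch. It is not Faceswap code: the helper names `frames_to_extract`, `bbox_diagonal` and `passes_min_size` are hypothetical, frame numbering is assumed to start at 1 (as in the video example above), and treating a face whose diagonal exactly equals the threshold as kept is an assumption.

```python
import math


def frames_to_extract(total_frames, every_n):
    """Return the 1-based frame numbers kept when taking every n-th frame.

    Hypothetical helper: assumes numbering starts at frame 1.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return list(range(1, total_frames + 1, every_n))


def bbox_diagonal(width, height):
    """Length in pixels across the diagonal of a width x height bounding box."""
    return math.hypot(width, height)


def passes_min_size(width, height, min_size):
    """Mimic the documented --min-size rule; a value of 0 turns the filter off.

    Assumption: a diagonal exactly equal to min_size is kept.
    """
    return min_size == 0 or bbox_diagonal(width, height) >= min_size


if __name__ == "__main__":
    frames = frames_to_extract(720, 12)
    print(len(frames), frames[0], frames[-1])  # 60 1 709
    print(passes_min_size(30, 40, 51))         # False (diagonal is 50.0)
```

With 720 frames and `-N 12` this gives 60 frames running from 1 to 709, matching the `foo_000001` to `foo_000709` names in the video example.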