Tutorials on Audio

Learn about Audio from fellow newline community members!

  • React
  • Angular
  • Vue
  • Svelte
  • NextJS
  • Redux
  • Apollo
  • Storybook
  • D3
  • Testing Library
  • JavaScript
  • TypeScript
  • Node.js
  • Deno
  • Rust
  • Python
  • GraphQL

ffmpeg - Editing Audio and Video Content (Part 1)

Online streaming and multimedia content platforms garner large audiences and consume a disproportionate amount of bandwidth compared to other types of platforms. These platforms rely on content creators to upload, share and promote their videos and music. To process and polish video and audio files, both professionals and amateurs typically turn to interactive software such as Adobe Premiere. Such software offers many tools to unleash the creativity of its users, but each comes with its own entry barriers (learning curve and pricing) and unique workflows for editing tasks. For example, in Adobe Premiere, to manually concatenate footage together, you create a nested sequence, which involves several steps of creating sequences and dragging and dropping clips into a workspace's timeline.

If you produce lots of content weekly for a platform such as YouTube, and work on a tight schedule that leaves no extra time for video editing, then you may consider hiring a dedicated video editor to handle it for you. Fortunately, you can also develop a partially autonomous workflow by offloading certain tedious tasks to FFmpeg. FFmpeg is a cross-platform, open-source library for processing multimedia content (e.g., videos, images and audio files) and converting between different video formats (e.g., MP4 to WebM). Commonly, developers use FFmpeg via the ffmpeg CLI tool, but there are also language-specific bindings that let you import FFmpeg as a package/dependency into your projects. With ffmpeg, Bash scripts can automate your workflow with simple, single-line commands, whether that is making montages, replacing a video's audio with stock background music or streamlining bulk uploads. This significantly reduces, or completely eliminates, your dependence on a user interface to manually perform these tasks by moving items around, clicking buttons, etc. Below, I'm going to show you...

Some operating systems already have ffmpeg installed. To check, simply type ffmpeg into the terminal. If the command is already installed, then the terminal prints a synopsis of ffmpeg. If ffmpeg is not yet installed on your machine, then visit the FFmpeg website, navigate to the "Download" page, download a compiled executable (compatible with your operating system) and execute it once the download is complete. Note: it is recommended to install the stable build to avoid unexpected bugs. Alternatively... For extensive documentation, enter the command man ffmpeg, which summons the manual pages for the ffmpeg command:

For this blog post, I will demonstrate the versatility of ffmpeg using the Big Buck Bunny video, an open-source, animated film built with Blender. Because downloading from the official Big Buck Bunny website might be slow for some users, download the ten-second Big Buck Bunny MP4 video (30 MB, 640 x 360) from Test Videos. The wget CLI utility downloads files from the web. Essentially, this command downloads the video to the current directory and names it Big_Buck_Bunny_360_10s_30MB.mp4. The -c option tells wget to resume an interrupted download from the most recent download position, and the -O option tells wget to save the file to a location of your choice under a name you choose.

The ffmpeg command follows the syntax shown below. For a full list of options supported by ffmpeg, consult the documentation. Square brackets and curly braces indicate optional items.
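As a rough sketch of the two commands referenced above: the exact Test Videos download URL is an assumption, while the synopsis line is the one given in the ffmpeg manual.

    # download the ten-second clip and resume if interrupted (URL assumed)
    wget -c https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/360/Big_Buck_Bunny_360_10s_30MB.mp4 -O Big_Buck_Bunny_360_10s_30MB.mp4

    # general shape of an ffmpeg invocation (from the ffmpeg manual)
    ffmpeg [global_options] {[input_file_options] -i input_url} ... {[output_file_options] output_url} ...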
Items grouped within square brackets are not required to be mutually exclusive, whereas items grouped within curly braces are required to be mutually exclusive. For example, you can provide the -i option with the path of the input file (infile) to ffmpeg without any infile options. However, to provide any outfile option, ffmpeg must also be given the path of the output file (outfile). To specify an input media file, provide its path to the -i option. Unlike an input media file, specifying an output media file does not require an option; it just needs to be the last argument provided to the ffmpeg command.

To print information about a media file, run the following command: Just providing an input media file to the ffmpeg command displays its details within the terminal. Here, the Metadata contains information such as the video's title ("Big Buck Bunny, Sunflower version") and encoder ("Lavf54.20.4"). The video runs for approximately ten and a half minutes at 30 FPS. To strip the FFmpeg banner information (i.e., the FFmpeg version) from the output of this command, provide the -hide_banner option. That's much cleaner!

To convert a media file to a different format, provide the outfile path (with the extension of the target format). For example, to convert an MP4 file to a WebM file... Note: depending on your machine's hardware, you may need to be patient with large files! To find out all the formats supported by ffmpeg, run the following command:

To reduce the amount of bandwidth consumed by users watching your videos in a mobile browser, or to save space on your hard drive or flash drive, compress your videos. Here, we specify a video filter with the -vf option. We pass a scale filter to this option that scales the video down to a quarter of its original width and height. The original aspect ratio is not preserved. Note: to preserve the aspect ratio, set either the target width or height to -1 (e.g., scale=360:-1 sets the width to 360px and the height to a value calculated from this width and the video's aspect ratio). The output file is less than 100 KB!

Here, we specify the H.265 video codec by setting the -c:v option to libx265. The -preset defines the speed of the encoding: the faster the encoding, the worse the compression, and vice versa. The default preset is medium, but we set it to fast, which is just one level above medium in terms of speed. The CRF is set to 28, the default quality level for this codec. The -tag:v option is set to hvc1 to allow QuickTime to play this video. The output file is less than 500 KB, and it still has the same aspect ratio and dimensions as the original video while maintaining acceptable quality!

Unfortunately, because browser support for H.265 is sparse, videos compressed with this standard cannot be viewed in most major browsers (e.g., Chrome and Firefox). Instead, compress videos with the H.264 video codec, an older standard that offers worse compression ratios (larger compressed files) than H.265 but can be played in all major browsers. Note: we don't need to provide the additional -tag:v option, since QuickTime automatically knows how to play videos compressed with H.264. Note: 23 is the default CRF value for H.264 (it corresponds visually to 28 for H.265, but the size of an H.264-compressed file will be roughly twice that of an H.265-compressed file).
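Hedged sketches of the commands described above. The output file names (other than the two _codec files mentioned in the next section) and the exact scale expression are assumptions made for illustration.

    # print information about a media file (add -hide_banner to drop the banner)
    ffmpeg -hide_banner -i Big_Buck_Bunny_360_10s_30MB.mp4

    # convert MP4 to WebM
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 Big_Buck_Bunny_360_10s_30MB.webm

    # list all supported formats
    ffmpeg -formats

    # scale the video down to a quarter of its original width and height
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 -vf "scale=iw/4:ih/4" Big_Buck_Bunny_360_10s_30MB_compressed.mp4

    # compress with H.265 (HEVC); -tag:v hvc1 lets QuickTime play the result
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 -c:v libx265 -preset fast -crf 28 -tag:v hvc1 Big_Buck_Bunny_360_10s_30MB_codec.mp4

    # compress with H.264 for playback in all major browsers
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 -c:v libx264 -crf 23 Big_Buck_Bunny_360_10s_30MB_codec_2.mp4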
Notice that the resulting video (Big_Buck_Bunny_360_10s_30MB_codec_2.mp4) is now twice the size of the previous one (Big_Buck_Bunny_360_10s_30MB_codec.mp4), but now you have a video that can be played in all major browsers. Simply drag and drop these videos into separate tabs of Chrome or Firefox to see this: Big_Buck_Bunny_360_10s_30MB_codec_2.mp4 plays in Firefox, while Big_Buck_Bunny_360_10s_30MB_codec.mp4 does not. Check out this codec compatibility table to ensure you choose the appropriate codec based on your videos and the browsers you need to support. Much like formats, to find out all the codecs supported by ffmpeg, run the following command:

First, let's download another video, the ten-second Jellyfish MP4 video (30 MB, 640 x 360), from Test Videos. To concatenate this video to the Big Buck Bunny video, run the following command: Since both video files are MP4s encoded with the same codec and parameters (e.g., dimensions and time base), they can be concatenated with the concat demuxer, which reads a list of video files from an input text file, demultiplexes the individual streams (e.g., audio, video and subtitles) of each file, and then multiplexes the constituent streams into a single coherent stream. Essentially, this command concatenates audio to audio, video to video, subtitles to subtitles, etc., and then combines these concatenations into a single video file. By omitting the decoding and encoding steps for the streams (via -c copy), the command quickly concatenates the files with no loss in quality. Note: setting the -safe option to 0 allows the demuxer to accept any file, regardless of protocol specification. If you are just concatenating files referenced via relative paths, then you can omit this option. When you play the concatenated.mp4 video file, you will notice that its duration is 20 seconds: it starts with the Big Buck Bunny video and then immediately jumps to the Jellyfish video at the 10-second mark. Note: if the input video files are encoded differently or are not of the same format, then you must re-encode all of the video files with the same codec before concatenating them.

Suppose you wanted to merge the audio of a video with stock background music to fill the silence. Download the audio file Ukulele from Bensound, then provide the video file and the stock background music file as input files to ffmpeg. We specify the video codec (-c:v) to be copy to tell FFmpeg to copy the video's bitstream directly to the output with zero quality changes, and we specify the audio codec (-c:a) to be aac (Advanced Audio Coding) to tell FFmpeg to encode the audio in an MP4-friendly format. Since our audio file is an MP3, which can be handled by an MP4 container, you can omit the -c:a option. To prevent the output from lasting as long as the two-and-a-half-minute audio file, and have it last only as long as the original video, add the -shortest option to tell FFmpeg to stop encoding once the shortest input file (in this case, the ten-second Big Buck Bunny video) is finished.

If your audio file happens to have a shorter duration than your video file, and you want to continuously loop the audio file until the end of the video, then pass the -stream_loop option to FFmpeg. Set its value to -1 to loop over the input stream indefinitely. Note: the -stream_loop option is applied to the input file that comes directly after it in the command, which happens to be the short.mp3 file.
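Hedged sketches of the commands described above, continuing under the same assumptions. The list file name, the Jellyfish and Ukulele file names, the -map options and the output names other than concatenated.mp4 are illustrative rather than taken from the article.

    # list all supported codecs
    ffmpeg -codecs

    # videos.txt (assumed name) lists the clips to join, one per line:
    #   file 'Big_Buck_Bunny_360_10s_30MB.mp4'
    #   file 'Jellyfish_360_10s_30MB.mp4'
    ffmpeg -f concat -safe 0 -i videos.txt -c copy concatenated.mp4

    # replace the video's audio with background music; -map explicitly picks
    # the video stream from the first input and the audio from the second
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 -i ukulele.mp3 -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest merged.mp4

    # loop a short audio file until the video ends
    ffmpeg -i Big_Buck_Bunny_360_10s_30MB.mp4 -stream_loop -1 -i short.mp3 -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest looped.mp4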
This audio file is shorter than the video file. Consult the FFmpeg Documentation to learn more about all of the different video and audio processing techniques it provides.


Jam on your MIDI keyboard in Angular

Web MIDI API is an interesting tool. Even though it has been around for a while now, it is still only supported by Chrome. But that's not gonna stop us from creating a playable synthesizer in Angular. It is time we bring Web Audio API to the next level! Previously, we spoke about declarative use of Web Audio API in Angular. Programming music is fun and all, but how about actually playing it?

There is a standard called MIDI. It's a messaging protocol for data exchange between electronic instruments, developed back in the 80s. And Chrome supports it natively. It means that if you have a synthesizer or a MIDI keyboard, you can hook it up to a computer and read what's played in the browser. You can even control other hardware from your machine. Let's learn how to do it the Angular way.

There's not a lot of documentation about it other than the specs. You request access to MIDI devices from navigator and you receive all MIDI inputs and outputs in a Promise. Those inputs and outputs (also called ports) act like event targets. Communication is performed through MIDIMessageEvents, which carry Uint8Array data. Each message is 3 bytes at most. The first one is called a status byte; every number has a particular role, like key press or pitch bend. The second and third integers are called data bytes. In the case of a key press, the second byte tells us which key is pressed and the third one is the velocity (how loud the note is played). The full spec is available on the official MIDI website.

In Angular we handle events with Observables, so the first step to adopting the Web MIDI API is to convert it to RxJS. To subscribe to events, we first need to get the MIDIAccess object to reach all inputs. As mentioned before, we request it from navigator and we get a Promise as a response. Thankfully, RxJS works with Promises too. We can create an Injection Token using NAVIGATOR from the @ng-web-apis/common package. This way we are not using global objects directly: Now that we have it, we can subscribe to all MIDI events. We can make the Observable in two ways. Since in this case there is not much setup required, a token would suffice. With a bit of extra code to handle Promise rejection, subscribing to all events would look like this: We can extract a particular MIDI port from MIDIAccess in case we want to, say, send an outgoing message. Let's add another token and a prepared provider to do this with ease:

To work with our stream we need to add some custom operators. After all, we shouldn't always analyze raw event data to understand what we're dealing with. Operators can be roughly broken into two categories: monotype and mapping. With the first group, we can filter the stream down to events of interest, for example to only listen to played notes or volume sliders. The second group alters elements for us, like dropping the rest of the event and delivering only the data array. Here's how we can listen to messages only from a given channel (out of 16): The status byte is organized in groups of 16: 128–143 are noteOff messages for each of the 16 channels, 144–159 are noteOn, and so on. So if we divide the status byte by 16 and take the remainder, we end up with that message's channel. If we only care about played notes, we can write the following operator: Now we can chain such operators to get the stream we need:

Time to put all this to work! With a little help from our Web Audio API library discussed in my previous article, we can create a nice-sounding synth with just a few directives. Then we need to feed it the played notes from the stream we've assembled (sketched below).
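A minimal sketch of such a stream, using plain RxJS and the native Web MIDI API rather than the article's tokens and operators. The names messages$ and notes$ are assumptions, and the Web MIDI DOM typings are assumed to be available.

    import {from, fromEvent, Observable} from 'rxjs';
    import {filter, map, mergeMap} from 'rxjs/operators';

    // turn the Web MIDI Promise into a stream of raw MIDI data arrays
    // coming from every available input port
    const messages$: Observable<Uint8Array> = from(navigator.requestMIDIAccess()).pipe(
      mergeMap(access => Array.from(access.inputs.values())),
      mergeMap(input => fromEvent(input, 'midimessage') as Observable<MIDIMessageEvent>),
      map(event => event.data!),
    );

    // status bytes come in groups of 16, so `status % 16` is the channel,
    // and 144–159 are noteOn messages; here we keep channel 0 only
    const notes$ = messages$.pipe(
      filter(data => data[0] >= 144 && data[0] <= 159 && data[0] % 16 === 0),
      // keep only the data bytes: [note, velocity]
      map(data => [data[1], data[2]] as [number, number]),
    );

    notes$.subscribe(([note, velocity]) => console.log(note, velocity));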
We will use the last code example as a starting point. To have a polyphonic synthesizer, we need to keep track of all played notes, so we will add scan to our chain (sketched below): To alter the volume of held keys and avoid ending the sound abruptly when we let go, we will create a proper ADSR pipe (the previous article had a simplified version): With this pipe we can throw in a fine synth in the template:

We iterate over the accumulated notes with the built-in keyvalue pipe, tracking items by the played key. Then we have two oscillators playing those frequencies and, at the end, a reverberation effect with ConvolverNode. Pretty basic setup and not a lot of code, but it gives us a playable instrument with rich sound. You can go ahead and give it a try in our interactive demo below.

In Angular, we are used to working with events using RxJS, and the Web MIDI API is not much different from regular events. With some architectural decisions and tokens, we managed to use MIDI in an Angular app. The solution we created is available as the @ng-web-apis/midi open-source package. It focuses mostly on receiving events. If you see something that is missing, like a helper function or another operator, feel free to open an issue. This library is a part of a bigger project called Web APIs for Angular, an initiative to create lightweight, high-quality wrappers of native APIs for idiomatic use with Angular. So if you want to try the Payment Request API or need a declarative Intersection Observer, you are very welcome to browse all our releases so far.
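For reference, the accumulation step mentioned above might look roughly like this, continuing the notes$ stream sketched earlier. The shapes and names are assumptions, not the article's actual code.

    import {scan} from 'rxjs/operators';

    // accumulate currently held keys into a Map of note -> velocity so the
    // template can render oscillators per key with the built-in keyvalue pipe
    const held$ = notes$.pipe(
      scan((held, [note, velocity]) => {
        const next = new Map(held);
        if (velocity === 0) {
          // a noteOn with zero velocity is commonly sent when a key is released
          next.delete(note);
        } else {
          next.set(note, velocity);
        }
        return next;
      }, new Map<number, number>()),
    );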


Writing Retrowave in Angular

Web Audio API has been around for a while now and there are lots of great articles about it, so I will not go into details regarding the API. What I will tell you is that Web Audio can be Angular's best friend if you introduce it well. So let's do this.

In Web Audio API you create a graph of audio nodes that process the sound passing through them. They can change volume, introduce delay or distort the signal. Browsers have special AudioNodes with various parameters to handle this. Initially, one would create them with factory functions of AudioContext: But since then they have become proper constructors, which means you can extend them. This allows us to use Web Audio API elegantly and declaratively in Angular, because Angular directives are classes and they can extend existing native classes.

A typical feedback loop to create an echo effect with Web Audio looks like this: We can see that vanilla code is purely imperative. We create objects, set parameters and manually assemble the graph using the connect method. In the example above we use the HTML audio tag; when the user presses play, they hear an echo on the audio file. We will replicate this case using directives.

AudioContext will be delivered through Dependency Injection. Both GainNode and DelayNode have only one parameter each: gain and delay time. That is not just a number, it is an AudioParam. We will see what that means a bit later. To declaratively link our nodes into a graph, we will add an AUDIO_NODE token. All our directives will provide it, and each directive takes the closest node from DI and connects to it. We've also added exportAs, which allows us to grab the node with template reference variables. Now we can build the graph in the template: We will end a branch and direct the sound to the speakers with waAudioDestinationNode:

To create loops like in the echo example above, Dependency Injection is not enough. We will make a special directive that allows us to pass a node as an input to connect to it: Both of those directives extend GainNode, which creates an extra node in the graph. It allows us to disconnect it in ngOnDestroy easily: we do not need to remember everything that is connected to our directive, we can just disconnect this from everything at once. The last directive we need to complete our example is a bit different. It's a source node and it's always at the top of our graph. We will put a directive on the audio tag and it will turn it into a MediaElementAudioSourceNode for us: Now let's create the echo example with our directives:

There are lots of different nodes in Web Audio API, but all of them can be implemented using a similar approach (a rough sketch of a gain directive built this way follows below). Two other important source nodes are OscillatorNode and AudioBufferSourceNode. Often we do not want to add anything to the DOM, and there is no need to provide audio file controls to the user. In that case AudioBufferSourceNode is a better option than the audio tag. The only inconvenience is that it works with an AudioBuffer, unlike the audio tag, which takes a link to an audio asset. We can create a service to mitigate that: Now we can create a directive that works both with an AudioBuffer and an audio asset URL:

Audio nodes have a special kind of property: AudioParam, for example gain in GainNode. That's why we used a setter for it. Such a property's value can be automated: you can set it to change linearly, exponentially or even over an array of values in a given time. We need some sort of handler that would allow us to take care of this for all such inputs of our directives.
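A minimal sketch of the idea under discussion, with assumed names and a simplified input rather than the actual @ng-web-apis/audio implementation: a directive that is itself a GainNode, provides itself through the AUDIO_NODE token, connects to the closest node above it and disconnects on destroy. The gain setter uses a short linear ramp on the underlying AudioParam, the kind of automation discussed above.

    import {
      Directive,
      Inject,
      InjectionToken,
      Input,
      OnDestroy,
      Optional,
      SkipSelf,
      forwardRef,
    } from '@angular/core';

    // token through which every audio directive exposes its node to children
    export const AUDIO_NODE = new InjectionToken<AudioNode>('AUDIO_NODE');

    @Directive({
      selector: '[waGainNode]',
      exportAs: 'AudioNode',
      providers: [
        {provide: AUDIO_NODE, useExisting: forwardRef(() => GainNodeDirective)},
      ],
    })
    export class GainNodeDirective extends GainNode implements OnDestroy {
      @Input()
      set gainValue(value: number) {
        // a short ramp instead of a hard set avoids clicking artifacts
        this.gain.linearRampToValueAtTime(value, this.context.currentTime + 0.02);
      }

      constructor(
        // AudioContext is assumed to be provided somewhere up the injector tree
        @Inject(AudioContext) context: AudioContext,
        @Optional() @SkipSelf() @Inject(AUDIO_NODE) node: AudioNode | null,
      ) {
        super(context);

        // link this node to the closest audio node above it in the template
        if (node) {
          this.connect(node);
        }
      }

      ngOnDestroy() {
        this.disconnect();
      }
    }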
A decorator is a good option for this case: The decorator passes processing to a dedicated function: Strong types will not allow us to accidentally use it for a non-existent parameter. So what would the AudioParamInput type look like? Besides number, it would include an automation object: The processAudioParam function translates those objects into native API commands. It's pretty boring, so I will just describe the principle. If the current value is 0 and we want it to change linearly to 1 over one second, we would pass {value: 1, duration: 1, mode: 'linear'}. For complex automation we would also need to support an array of such objects.

We would typically pass an automation object with a short duration instead of a plain number. It prevents audible clicking artifacts when a parameter changes abruptly. But it's not convenient to do it manually all the time, so let's create a pipe that takes the target value, duration and an optional mode as arguments:

Besides that, an AudioParam can be automated by connecting an oscillator to it. Usually a frequency lower than 1 Hz is used, and it is called an LFO (Low Frequency Oscillator). It can create movement in sound. In the example below it adds texture to otherwise static chords by modulating the frequency of a filter they pass through. To connect an oscillator to a parameter, we can use our waOutput directive. We can access the node thanks to exportAs:

Web Audio API can be used for different things, from real-time processing of a voice for a podcast to math computations, Fourier transforms and more. Let's compose a short music piece using our directives. We will start with a simple task: a straight drum beat. To count beats we will create a stream and add it to DI: We have 4 beats per measure, so let's map our stream: Now it gives us true at the beginning and false in the middle of each bar. We will use it to play audio samples:

Now let's add a melody. We will use numbers to indicate notes, where 69 means the middle A note. The function that translates this number to a frequency can easily be found on Wikipedia (a sketch of it appears at the end of this article). Here's our tune: Our component will play the right frequency for each note on each beat: And inside its template we will have a real synthesizer! But first we need another pipe. It would automate the volume with an ADSR envelope. That means "Attack, Decay, Sustain, Release" and here's how it looks: In our case we need the sound to start quickly and then fade away. The pipe is rather simple: Now we will use it for our synth tune:

Let's figure out what's going on here. We have two oscillators. The first one is just a sine wave passed through the ADSR pipe. The second one is the same echo loop we've seen, except this time it passes through a ConvolverNode. It creates room acoustics using an impulse response. It's a big and interesting subject of its own, but it is outside this article's scope. All other tracks in our song are made similarly: nodes are connected to each other, parameters are automated with LFOs or change smoothly via our pipes.

I only went over a small portion of this subject, simplifying corner cases. We've made a complete conversion of Web Audio API into a declarative Angular open-source library, @ng-web-apis/audio. It covers all the nodes and features. This library is a part of a bigger project called Web APIs for Angular, an initiative with the goal of creating lightweight, high-quality wrappers of native APIs for idiomatic use with Angular. So if you want to try, say, the Payment Request API or play with your MIDI keyboard in the browser, you are very welcome to browse all our releases so far.
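For reference, the note-to-frequency conversion mentioned above is the standard equal-temperament formula; the function name here is an assumption.

    // MIDI note 69 is the A above middle C at 440 Hz; each semitone
    // multiplies the frequency by the twelfth root of two
    function noteToFrequency(note: number): number {
      return 440 * Math.pow(2, (note - 69) / 12);
    }

    noteToFrequency(69); // 440 Hz
    noteToFrequency(81); // 880 Hz, one octave higher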
