Part 1: Import Libraries
import cv2
import numpy as np
import pyautogui
import pyaudio
import wave
- cv2: This is OpenCV, used here to encode the video frames and display the preview window.
- numpy: A library for numerical operations. Here, it's used to turn each screenshot into an array that OpenCV can process.
- pyautogui: Used to read the screen's dimensions and capture screenshots.
- pyaudio: For capturing audio from an input device, typically the microphone. Recording system audio requires a loopback input device.
- wave: A standard-library module that saves audio data in the WAV file format.
Part 2: Function Definition
def screen_and_audio_recorder():
This line defines a function called screen_and_audio_recorder. All the code in the following parts, up to the point where the function is called, belongs to the body of this function. The function handles both screen capturing and audio recording.
Part 3: Screen Capture Setup
screen_width, screen_height = pyautogui.size()
- pyautogui.size() (from the pyautogui package) returns the width and height of the primary screen in pixels.
- screen_width and screen_height store the dimensions of the screen, which are used to set the video frame size.
Part 4: Video Writing Setup
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('screen_audio_recording.avi', fourcc, 8.0, (screen_width, screen_height))
- fourcc is a four-character code that specifies the video codec. 'XVID' selects the Xvid codec, which is used to compress the video.
- out is an object created by cv2.VideoWriter(), which is used to write frames into the file screen_audio_recording.avi. The parameters are the output filename, the codec, the frame rate (8.0 FPS here), and the frame size.
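The 8.0 FPS value only sets the playback rate stored in the file; it does not control how fast frames are actually captured. If the capture loop runs faster or slower than 8 iterations per second, the video plays back at the wrong speed. A minimal sketch of one way to pace a loop, assuming a hypothetical FramePacer helper that is not part of the original script:

```python
import time


class FramePacer:
    """Hypothetical helper that spaces loop iterations 1/fps seconds apart."""

    def __init__(self, fps, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / fps
        self._clock = clock
        self._sleep = sleep
        self._next_deadline = clock() + self.interval

    def wait(self):
        """Sleep until the next frame is due and return the time slept."""
        remaining = self._next_deadline - self._clock()
        if remaining > 0:
            self._sleep(remaining)
        else:
            # The loop is running late: restart the schedule from now
            # instead of trying to catch up with a burst of frames.
            self._next_deadline = self._clock()
            remaining = 0.0
        self._next_deadline += self.interval
        return remaining
```

In a recording loop, a pacer created with FramePacer(8.0) before the loop and a single pacer.wait() call per iteration would keep the capture rate close to the rate declared to the video writer, as long as one iteration takes less than 1/8 of a second.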
Part 5: Audio Recording Setup
audio_format = pyaudio.paInt16
channels = 2
rate = 44100
chunk = 1024
audio = pyaudio.PyAudio()
stream = audio.open(format=audio_format, channels=channels, rate=rate, input=True, frames_per_buffer=chunk)
frames = []
- audio_format, channels, rate, and chunk are parameters defining the audio quality and the way audio is chunked for processing.
- audio is an instance of PyAudio(), used to interact with the audio hardware.
- stream is used to open the audio stream for recording. It's configured to capture stereo audio at a rate of 44100 Hz (CD quality).
- frames is a list used to store the chunks of audio data captured during the recording.
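To make these numbers concrete, the following sketch works out how much audio one chunk holds with the settings above. It uses only arithmetic; the 2-byte sample width is the size of one 16-bit sample, which is what pyaudio.paInt16 denotes.

```python
# Values taken from the audio setup above.
channels = 2
rate = 44100          # frames (one sample per channel) per second
chunk = 1024          # frames read per call to stream.read()
sample_width = 2      # bytes per 16-bit sample (paInt16)

chunk_duration = chunk / rate                       # seconds of audio per chunk
chunk_bytes = chunk * channels * sample_width       # size of one chunk in bytes
bytes_per_second = rate * channels * sample_width   # raw data rate

print(f"{chunk_duration * 1000:.1f} ms per chunk")
print(f"{chunk_bytes} bytes per chunk")
print(f"{bytes_per_second} bytes per second")
```

Each chunk therefore holds about 23 ms of audio, so the stream has to be read roughly 43 times per second to keep up with the input device.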
Part 6: Recording Loop
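while True:
    img = pyautogui.screenshot(region=(0, 0, screen_width, screen_height))
    frame = np.array(img)
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    out.write(frame)
    audio_data = stream.read(chunk, exception_on_overflow=False)
    frames.append(audio_data)
    cv2.imshow('Screen and Audio Recorder', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
- The loop runs indefinitely, capturing one screenshot and one audio chunk per iteration.
- img captures a screenshot of the specified region (the entire screen) with pyautogui.screenshot(), a function of the pyautogui package.
- frame converts the screenshot to an array and reorders the color channels from RGB, which the screenshot uses, to BGR, which OpenCV expects.
- The video frame is written to the output file, and the audio data is appended to the frames list. Passing exception_on_overflow=False stops stream.read() from raising an error when audio arrives faster than the loop reads it; the overflowed audio is dropped, so the audio file can end up shorter than the video.
- The cv2.imshow() function displays the frame in a window.
- The loop breaks when the 'q' key is pressed while the preview window has focus.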
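One 1024-frame chunk holds only about 23 ms of audio at 44100 Hz, and a screenshot usually takes longer than that, so reading audio in the same loop as the video can lose sound. One common alternative is to read the audio on a background thread. The sketch below assumes a hypothetical AudioCollector class (not part of the original script) that works with any object exposing a read(num_frames) method, such as a PyAudio input stream:

```python
import threading


class AudioCollector(threading.Thread):
    """Hypothetical helper: reads audio chunks on a background thread."""

    def __init__(self, stream, chunk):
        super().__init__(daemon=True)
        self.stream = stream
        self.chunk = chunk
        self.frames = []
        self._stop_event = threading.Event()

    def run(self):
        # Keep reading until stop() is called from the main thread.
        while not self._stop_event.is_set():
            self.frames.append(self.stream.read(self.chunk))

    def stop(self):
        """Signal the thread to finish and wait for it to exit."""
        self._stop_event.set()
        self.join()
```

A recorder built this way would call start() on the collector before the video loop, call stop() after the loop ends, and then write the collector's frames list to the WAV file in place of the list filled inside the loop.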
Part 7: Clean-up and Saving Audio
stream.stop_stream()
stream.close()
audio.terminate()
out.release()
cv2.destroyAllWindows()
with wave.open('audio_recording.wav', 'wb') as wf:
    wf.setnchannels(channels)
    wf.setsampwidth(audio.get_sample_size(audio_format))
    wf.setframerate(rate)
    wf.writeframes(b''.join(frames))
- Stops the audio stream, closes it, and terminates the PyAudio session.
- Releases the video file and closes all OpenCV windows.
- Saves the captured audio data into a WAV file, setting the necessary audio parameters like channels and sample width.
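The same wave calls can be tried without a microphone. The sketch below writes one second of a synthetic 440 Hz tone, an illustrative stand-in for the recorded frames, and reads the file back to check its length. The helper names and the file name tone_test.wav are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave


def write_tone(path, seconds=1.0, freq=440.0, rate=44100, channels=2):
    """Write a sine tone as 16-bit PCM, mirroring the script's WAV settings."""
    n_frames = int(seconds * rate)
    data = bytearray()
    for i in range(n_frames):
        sample = int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate))
        # One 16-bit little-endian sample per channel makes up one frame.
        data += struct.pack("<h", sample) * channels
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 2 bytes = 16 bits, the same width as paInt16
        wf.setframerate(rate)
        wf.writeframes(bytes(data))


def wav_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


path = os.path.join(tempfile.gettempdir(), "tone_test.wav")
write_tone(path)
print(wav_duration(path))
```

Pointing wav_duration at audio_recording.wav after a recording gives a quick way to compare the length of the audio with the length of the video.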
Complete Python Script for Screen and Audio Recording
import cv2
import numpy as np
import pyautogui
import pyaudio
import wave


def screen_and_audio_recorder():
    # Get the primary screen's dimensions
    screen_width, screen_height = pyautogui.size()

    # Define the codec and create VideoWriter object to save the video
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    out = cv2.VideoWriter('screen_audio_recording.avi', fourcc, 8.0, (screen_width, screen_height))

    # Audio recording setup
    audio_format = pyaudio.paInt16
    channels = 2
    rate = 44100
    chunk = 1024
    audio = pyaudio.PyAudio()
    stream = audio.open(format=audio_format, channels=channels, rate=rate, input=True, frames_per_buffer=chunk)
    frames = []

    while True:
        # Capture the screen (pyautogui returns an RGB image)
        img = pyautogui.screenshot(region=(0, 0, screen_width, screen_height))
        frame = np.array(img)
        # OpenCV expects BGR channel order
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)

        # Write the video frame
        out.write(frame)

        # Capture audio; do not raise if the input buffer overflowed
        audio_data = stream.read(chunk, exception_on_overflow=False)
        frames.append(audio_data)

        # Display the recording frame
        cv2.imshow('Screen and Audio Recorder', frame)

        # Stop recording when 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Release everything after recording
    stream.stop_stream()
    stream.close()
    audio.terminate()
    out.release()
    cv2.destroyAllWindows()

    # Save the recorded audio to a wave file
    with wave.open('audio_recording.wav', 'wb') as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(audio.get_sample_size(audio_format))
        wf.setframerate(rate)
        wf.writeframes(b''.join(frames))


# Call the function to start recording
screen_and_audio_recorder()
How to Run the Script
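Install Required Libraries: Make sure you have the necessary Python libraries installed. If not, you can install them using the following command in your terminal or command prompt. PyAudio depends on the PortAudio library, which may need to be installed separately on macOS and Linux.
pip install opencv-python numpy pyautogui pyaudio
Run the Script: Save the code to a .py file and run it with Python. Press 'q' in the preview window to stop recording.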
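To confirm the installation, a short check like the following can report any module that Python cannot find. The missing_modules helper is a hypothetical name for this sketch, and the list shown is only an example that should be extended with every module the script imports. Note that it uses import names, which differ from some pip package names.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip package names: opencv-python is imported as cv2.
required = ["cv2", "numpy", "pyaudio", "wave"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```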
Conclusion
This script records the screen and audio at the same time. When executed, it produces a video file of the screen activity (screen_audio_recording.avi) and a separate audio file (audio_recording.wav).
You could later use video editing software to synchronize and combine these into a single video file with audio.
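As one alternative to a video editor, the two files can be combined from the command line with FFmpeg, assuming it is installed and on the PATH (it is not used by the original script). The sketch below builds and runs such a command; the helper names and the output name combined.mkv are hypothetical. Because the audio and video are captured at different paces, the result may still need manual synchronization.

```python
import shutil
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that copies the video and adds the audio."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", video_path,     # first input: the recorded video
        "-i", audio_path,     # second input: the recorded audio
        "-c:v", "copy",       # keep the video stream as-is
        "-c:a", "aac",        # encode the WAV audio as AAC
        "-shortest",          # stop at the end of the shorter input
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run ffmpeg if available; return True when the command succeeded."""
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found on PATH; skipping")
        return False
    cmd = build_mux_command(video_path, audio_path, output_path)
    return subprocess.run(cmd).returncode == 0
```

A call such as mux("screen_audio_recording.avi", "audio_recording.wav", "combined.mkv") would then produce a single file containing both streams.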

Ishaan is a passionate programmer with a knack for simplifying complex ideas. With expertise in Python and PHP, he enjoys creating small projects that help others learn and apply coding skills practically. Through his articles, Ishaan aims to inspire and educate budding developers.