Convert any PDF to an audiobook using Python and a little help from ChatGPT

The code below converts a PDF to an MP3 file of spoken text. It relies on the pyttsx3 and PyPDF2 packages, both installable with pip. Keep in mind that the resulting MP3 will be very large if the PDF is lengthy. Verified working on a Windows machine with Python 3.9.2.

import pyttsx3
import PyPDF2

# Open the PDF file in read-binary mode
pdf_file = open('yourPDF.pdf', 'rb')

# Read the PDF file using PyPDF2
pdf_reader = PyPDF2.PdfReader(pdf_file)

# Store the number of pages
totalpages = len(pdf_reader.pages)

# Initialize the TTS engine
engine = pyttsx3.init()

# Loop through each page in the PDF and collect its text
text = ""
for page_num in range(totalpages):
    page = pdf_reader.pages[page_num]
    # extract_text() can come back empty for image-only pages
    text += (page.extract_text() or "") + "\n"

# Queue the text to be saved to an MP3 file
engine.save_to_file(text, 'yourAudiobook.mp3')

# Run the TTS engine; this processes the queue and writes the file
engine.runAndWait()

# Close the PDF file
pdf_file.close()
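
Since a lengthy PDF produces one very large file, one option is to write a separate audio file for every few pages instead. The sketch below is only an illustration of that idea, not part of the verified script above: the helper names group_pages, chunk_filename and pdf_to_audio_parts, the pages_per_file parameter and the numbered file naming are all assumptions made for this example. It also assumes that several save_to_file calls can be queued before a single runAndWait, which may behave differently depending on the platform and pyttsx3 version.

```python
from typing import List


def group_pages(page_texts: List[str], pages_per_file: int) -> List[str]:
    """Join consecutive page texts into chunks of at most pages_per_file pages."""
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    chunks = []
    for start in range(0, len(page_texts), pages_per_file):
        chunk = "\n".join(page_texts[start:start + pages_per_file]).strip()
        if chunk:  # skip groups of pages with no extractable text
            chunks.append(chunk)
    return chunks


def chunk_filename(prefix: str, index: int) -> str:
    """Build a zero-padded file name (hypothetical naming scheme)."""
    return f"{prefix}_{index:03d}.mp3"


def pdf_to_audio_parts(pdf_path: str, prefix: str, pages_per_file: int = 10) -> List[str]:
    """Write one audio file per group of pages and return the file names."""
    # Imported here so the helpers above work without these packages installed
    import PyPDF2
    import pyttsx3

    # Read every page while the file is still open
    with open(pdf_path, "rb") as pdf_file:
        reader = PyPDF2.PdfReader(pdf_file)
        page_texts = [page.extract_text() or "" for page in reader.pages]

    engine = pyttsx3.init()
    names = []
    for index, chunk in enumerate(group_pages(page_texts, pages_per_file), start=1):
        name = chunk_filename(prefix, index)
        engine.save_to_file(chunk, name)  # queued; written by runAndWait()
        names.append(name)

    # Assumption: all queued save commands are processed in this one call
    engine.runAndWait()
    return names
```

Calling pdf_to_audio_parts('yourPDF.pdf', 'yourAudiobook', pages_per_file=10) would then be expected to produce yourAudiobook_001.mp3, yourAudiobook_002.mp3 and so on, one file per ten pages. Smaller parts are also easier to resume if the conversion is interrupted partway through.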

Author: PS
