WordPress文章导出PDF
最近需要ima的知识库提供我的博客的文章,需要pdf文件,尝试了用PDF-Print插件等方法由于文章太多导出总是报错,于是通过DeepSeek提供的Python导出代码实现了文章批量导出,因为用的是WordPress的官方API获取文章的接口是分页查询的每页最多100条,我的博客有7900多条,于是再次提问DeepSeek给出了分页导出方案,具体代码如下:
import os
import time
import requests
import pdfkit
# WordPress REST API URL
api_url = "https://www.yourdomain.com/wp-json/wp/v2/posts"
# 设置pdfkit的wkhtmltopdf路径(根据你的安装路径配置)
# Windows示例:
pdfkit_config = pdfkit.configuration(wkhtmltopdf=r'D:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')
# macOS/Linux示例:
#pdfkit_config = pdfkit.configuration(wkhtmltopdf='/usr/local/bin/wkhtmltopdf')
# 获取文章列表
response = requests.get(api_url)
posts = response.json()
# 定义 CSS 样式(关键!)
css = """
"""
# 导出每篇文章为PDF
for post in posts:
try:
title = post['title']['rendered']
content = post['content']['rendered']
# 拼接完整的 HTML(包含编码声明和 CSS)
html = f"""
{css}
{title}
{content}
"""
# 创建PDF文件名
filename = f"{title}.pdf"
# 将HTML内容转换为PDF
pdfkit.from_string(html, filename, configuration=pdfkit_config)
print(f"Exported: {filename}")
# 释放资源
time.sleep(2)
if os.name == 'nt':
os.system("taskkill /f /im wkhtmltopdf.exe")
except Exception as e:
print(f"exception message:{title}\n errormessage:{e}")
continue
print("All posts have been exported as PDFs.")