somark-document-parser is an AI Agent Skill that lets Claude Code, Cursor, Cline, OpenCode, and 40+ other AI coding assistants read PDFs, images, Word files, and PowerPoint files with their structure intact: headings, tables, formulas, and layout, not just OCR’d text.
1. Install

npx skills add https://github.com/SoMarkAI/somark-document-parser
Compatible with Claude Code, Cursor, Cline, OpenCode, and 40+ other AI agents.
2. Set up your API key

Get an API key at somark.tech, then set it as an environment variable:
export SOMARK_API_KEY=sk-your-api-key
You can also configure it in your agent’s settings. The skill will guide you through setup on first use.
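If you script around the skill, it can help to confirm the key is visible to the process before any document is sent. The following is a minimal sketch, not part of SoMark itself: the helper name `get_somark_api_key` is hypothetical, and only the `SOMARK_API_KEY` variable name comes from the instructions above.

```python
import os


def get_somark_api_key(env=os.environ):
    """Return the SoMark API key from the environment, or raise if it is missing.

    `env` defaults to the real environment; passing a plain dict makes the
    helper easy to test. The function name is hypothetical.
    """
    key = env.get("SOMARK_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SOMARK_API_KEY is not set. Run: export SOMARK_API_KEY=sk-your-api-key"
        )
    return key


if __name__ == "__main__":
    try:
        get_somark_api_key()
        print("SOMARK_API_KEY is set")
    except RuntimeError as err:
        print(err)
```

The check only verifies that the variable is present and non-empty; it does not contact SoMark or validate the key.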
SoMark includes a free tier (500 pages/day, 10,000 pages/month) that is credited to your account automatically. Once the free quota is used up, further pages are charged against your paid quota automatically.
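To see how the two free-tier caps interact, here is a small arithmetic sketch. It assumes that both caps apply at once, so the free pages available at any moment are the smaller of what is left today and what is left this month; the source does not spell this out, and the function names are hypothetical. At a steady 500 pages/day, the monthly allowance of 10,000 pages is reached after 20 days.

```python
FREE_PAGES_PER_DAY = 500
FREE_PAGES_PER_MONTH = 10_000


def free_pages_left(used_today, used_this_month):
    """Free pages still available, assuming both caps apply simultaneously."""
    daily_left = max(0, FREE_PAGES_PER_DAY - used_today)
    monthly_left = max(0, FREE_PAGES_PER_MONTH - used_this_month)
    return min(daily_left, monthly_left)


def paid_pages(pages, used_today, used_this_month):
    """Pages of a job that would fall on the paid quota under that assumption."""
    return max(0, pages - free_pages_left(used_today, used_this_month))


if __name__ == "__main__":
    # A 120-page job after 450 pages today and 2,000 this month.
    print(free_pages_left(450, 2_000), paid_pages(120, 450, 2_000))
```

Your account dashboard remains the authoritative source for actual usage and billing.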
3. Use

Give your AI assistant a natural-language instruction and it will call SoMark automatically:
  • “Parse this PDF for me”
  • “Extract the key clauses from this contract”
  • “Summarize the paper I just uploaded”
  • “Convert this document to Markdown”
  • “What does this image say?”

Supported File Formats

Type        Formats
Documents   PDF, DOC, DOCX, PPT, PPTX
Images      PNG, JPG, JPEG, BMP, TIFF, WEBP, HEIC, HEIF, GIF
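A quick local check against the list above can save a wasted request. This is a minimal sketch that classifies a file by its extension only; the `classify` helper is hypothetical, and matching on extension (rather than on file contents) is an assumption about what is good enough for a pre-check.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
DOCUMENT_EXTS = {"pdf", "doc", "docx", "ppt", "pptx"}
IMAGE_EXTS = {"png", "jpg", "jpeg", "bmp", "tiff", "webp", "heic", "heif", "gif"}


def classify(path):
    """Return "document", "image", or None for an unsupported extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in DOCUMENT_EXTS:
        return "document"
    if ext in IMAGE_EXTS:
        return "image"
    return None


if __name__ == "__main__":
    for name in ["report.PDF", "scan.heic", "notes.txt"]:
        print(name, classify(name))
```

The check is case-insensitive and only accepts extensions that appear in the list, so a variant such as `.tif` is reported as unsupported here even if the service might accept it.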

Why SoMark

Most agents struggle with documents because text pulled from raw PDFs and images loses its structure. SoMark preserves:
  • Heading hierarchy — agents understand document sections
  • Tables — fully reconstructed, not flattened into prose
  • Formulas and diagrams — converted to LaTeX or described accurately
  • Multi-column layouts — reading order maintained
The result: your agent gives accurate, context-aware answers instead of hallucinating from garbled text.

Limits

Constraint                            Limit
Max file size                         200 MB
Max pages per file                    300
QPS (queries per second) per account  1
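When processing many files in a batch, these limits can be enforced on the client side before anything is sent. The sketch below is illustrative only: `check_limits` and `Throttle` are hypothetical helpers, "200 MB" is assumed to mean 200 × 1024 × 1024 bytes, and the page count is assumed to come from whatever tool you already use to inspect the file.

```python
import time

MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_PAGES = 300
MIN_SECONDS_BETWEEN_CALLS = 1.0  # 1 QPS per account


def check_limits(size_bytes, page_count=None):
    """Return a list of limit violations; an empty list means the file fits."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    if page_count is not None and page_count > MAX_PAGES:
        problems.append(f"file has {page_count} pages, limit is {MAX_PAGES}")
    return problems


class Throttle:
    """Space successive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=MIN_SECONDS_BETWEEN_CALLS,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each request keeps a single process at or below one request per second. The limit is per account, so several processes sharing one key would need to coordinate among themselves.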