How to Import PDF into Excel: Complete Step-by-Step Guide for Power Query, Adobe, and Manual Methods
Learn how to import PDF into Excel using Power Query, Adobe Acrobat, and manual methods. Step-by-step tutorial with tips for tables, scanned PDFs, and more.

Learning how to import PDF into Excel has become an essential skill for anyone who works with data, financial reports, invoices, or tabular information trapped inside PDF documents. Whether you are a financial analyst extracting quarterly statements, a sales manager compiling vendor invoices, or a student gathering research data, the ability to convert PDF tables into editable Excel spreadsheets saves hours of manual retyping. Excel 365 and Excel 2021 now include a built-in PDF connector through Power Query that handles this conversion natively.
The process has evolved significantly over the past five years. Microsoft introduced the Get Data from PDF feature in 2020, making Excel one of the few spreadsheet applications with native PDF parsing capabilities. Before this update, users had to rely on third-party converters, Adobe Acrobat Pro subscriptions, or tedious copy-paste workflows that often destroyed table formatting. Today, you can often import a multi-page PDF containing dozens of tables in about a minute, with most of the table structure preserved.
This comprehensive guide covers every method available for importing PDF data into Excel, from the modern Power Query approach to legacy techniques that still work well for specific scenarios. You will learn which method suits scanned documents versus text-based PDFs, how to handle password-protected files, and what to do when tables span multiple pages. We will also address common formatting issues that arise during import and the troubleshooting steps that resolve them.
Power Query, the engine behind Excel's PDF import feature, uses the same M language that powers data transformations throughout Microsoft's Power Platform ecosystem. Once your PDF data lands in Excel, you can clean it, merge it with other sources, and refresh the connection whenever the source PDF updates. This refresh capability transforms one-time imports into automated data pipelines, particularly valuable for recurring reports like monthly bank statements or weekly sales summaries.
Beyond the technical mechanics, choosing the right import method depends on your PDF's structure. Text-based PDFs created from Word or Excel exports import cleanly, while scanned image PDFs require optical character recognition (OCR) before any data extraction can occur. Knowing the difference upfront prevents frustration and ensures you select the appropriate tool. Adobe Acrobat Pro, ABBYY FineReader, and online converters like Smallpdf each fill specific niches in the PDF-to-Excel workflow.
Throughout this article, you will find step-by-step instructions, screenshot descriptions, real-world examples, and practical tips drawn from accountants, data analysts, and Excel power users. We will compare native Excel tools against premium alternatives so you can make informed decisions about which workflow fits your budget and technical comfort level. By the end, you will be able to extract structured data from most PDFs and integrate it into your Excel reporting workflows.
If you are looking to test your broader Excel skills after mastering PDF imports, practice quizzes can reinforce concepts like data cleaning, formulas, and table manipulation. The techniques covered here pair naturally with VLOOKUP, INDEX/MATCH, and Power Query transformations that you will likely apply once your PDF data lives inside Excel. Let's dive into the methods, starting with the fastest native approach.

PDF Import Methods Overview
- Identify PDF Type
- Choose Import Method
- Execute Import
- Clean and Transform
- Load to Worksheet
- Refresh and Maintain
Power Query represents the gold standard for importing PDF data into Excel because it preserves structure, supports refresh operations, and requires no additional software beyond Excel 365 or Excel 2021. The feature lives under the Data tab and processes PDFs through the same engine that handles SQL databases, CSV files, and web scraping. Microsoft built this capability after recognizing that millions of users were resorting to third-party tools for a workflow Excel should handle natively.
To start the Power Query PDF import process, open a blank Excel workbook and click the Data tab on the ribbon. Look for the Get Data dropdown on the far left side of the ribbon. Click it, hover over From File, and select From PDF from the submenu. A standard file browser dialog appears, allowing you to navigate to your PDF document. Select the file and click Import to begin the analysis process, which typically takes five to fifteen seconds depending on document size.
The Navigator window opens next, displaying every table and page Excel detected within your PDF. Tables appear with the prefix Table001, Table002, and so forth, while page-level views show as Page001, Page002. Click any item to preview its contents in the right pane. This preview helps you decide whether to import individual tables or entire pages. For documents with consistent table structures across pages, the Page view often provides cleaner results than fragmented Table views.
If your data looks ready to use, click Load to send it directly into a new worksheet. However, most real-world PDFs benefit from clicking Transform Data instead, which opens the Power Query Editor for cleanup. Inside the editor, you can remove empty rows, merge split columns, change data types from text to numbers, and apply dozens of other transformations. Each step records in the Applied Steps panel on the right, creating a repeatable workflow.
One powerful feature of Power Query is multi-table consolidation. If your PDF contains the same table structure repeated across many pages, you can append all tables into a single dataset. Use the Append Queries function under the Home tab in Power Query Editor, then select the tables you want to combine. This technique works brilliantly for bank statements where each page shows transactions in identical column layouts, letting you build a complete annual transaction history from a single PDF.
For users dealing with consistent data structures, consider how Power Query transformations work alongside the Excel FILTER function and other modern dynamic array functions. After loading PDF data into your worksheet, you can apply formulas referencing the loaded table by name, building dashboards that automatically reflect refreshed PDF imports. This creates living reports that update whenever you re-import an updated PDF version, eliminating manual rework cycles entirely.
Once your data lands in Excel, you may want to apply standard cleanup operations like removing duplicate entries or filtering specific records. Techniques for counting unique values, such as COUNTIF and SUMPRODUCT formulas, become useful when analyzing imported transaction lists. Power Query gives you the foundation, but downstream Excel features amplify what you can do with that newly accessible data.
Alternative Methods When Power Query Is Not Enough
Adobe Acrobat Pro offers the most reliable PDF-to-Excel conversion for complex documents containing mixed content, irregular table layouts, or scanned pages requiring OCR. The Export PDF feature recognizes tables intelligently, preserves formatting, and can export password-protected files once you supply the password. A subscription costs around twenty dollars monthly, but the accuracy on difficult documents often justifies the price for finance teams and accountants processing dozens of PDFs weekly.
To use Acrobat, open your PDF, click Export PDF in the right panel, select Spreadsheet as the export format, and choose Microsoft Excel Workbook. Acrobat analyzes the document, runs OCR on scanned content automatically, and produces an .xlsx file with one worksheet per detected page. Quality typically exceeds Power Query results on heavily formatted documents, especially those with merged cells, colored backgrounds, or footnotes that confuse simpler parsers.

Power Query PDF Import: Pros and Cons
- + Native Excel feature requires no additional software or subscriptions
- + Refreshable connections enable automated data pipelines from recurring PDFs
- + Transformations record automatically and reapply on every refresh
- + Handles multi-page documents and complex multi-table layouts efficiently
- + Free with Excel 365 or Excel 2021 license already purchased
- + Integrates with other Power Query sources for unified data workflows
- + Preserves data types when properly configured during transformation steps
- − Requires Excel 365 or 2021; unavailable in Excel 2019 and earlier
- − Cannot process scanned image PDFs without external OCR preprocessing
- − Struggles with merged cells and footnotes that break table detection
- − Initial transformation setup can be tedious for highly irregular documents
- − No support for password-protected PDFs without first removing protection
- − Multi-column page layouts sometimes confuse automatic table detection
- − Performance slows significantly on PDFs exceeding one hundred pages
Pre-Import Checklist Before You Import PDF into Excel
- ✓ Verify the PDF contains selectable text by attempting to highlight individual words
- ✓ Confirm you have Excel 2021 or a Microsoft 365 subscription (Excel 365) for native PDF support
- ✓ Check that the document is not password-protected, or remove protection first
- ✓ Close the PDF in any other application to prevent file lock conflicts
- ✓ Save a backup copy of your PDF before any conversion attempts in case of issues
- ✓ Identify which specific pages or tables you need rather than importing everything
- ✓ Ensure your Excel workbook is saved before starting the import process
- ✓ Update Excel to the latest version through File > Account > Update Options
- ✓ Review the PDF structure for merged cells or unusual layouts that may need cleanup
- ✓ Plan your worksheet destination: new workbook, existing sheet, or data model
Always use Transform Data instead of Load
Even when imported data looks perfect, click Transform Data instead of Load. This opens the Power Query Editor where you can verify data types, promote headers correctly, and add cleanup steps. These transformations save into your workbook and reapply automatically every time you refresh the connection, eliminating manual cleanup on every update cycle.
Scanned PDFs present a fundamentally different challenge than text-based PDFs because they contain images of text rather than actual text characters. When you scan a paper document or receive a faxed report converted to PDF, the resulting file looks identical to a text PDF on screen but contains no machine-readable characters. Excel's Power Query and most online converters fail silently on these files, producing empty results or random gibberish. Optical character recognition (OCR) bridges this gap by analyzing image pixels and inferring the underlying text.
Adobe Acrobat Pro includes built-in OCR that runs automatically when you export scanned PDFs to Excel. The Recognize Text feature processes images and embeds searchable text behind the visual content, transforming a scanned image into a hybrid document that behaves like a text PDF for extraction purposes. Quality depends on scan resolution: a minimum of three hundred DPI produces reliable results, while two hundred DPI scans introduce frequent character misreads, especially among the digits one and seven and the lowercase letter L.
ABBYY FineReader specializes in high-accuracy OCR with table reconstruction capabilities exceeding Adobe's built-in features. The software costs around two hundred dollars for a perpetual license but delivers professional-grade results on degraded scans, multilingual documents, and complex table structures. Finance professionals processing historical archived documents often justify the cost through reduced manual correction time, which compounds quickly when handling hundreds of legacy paper records monthly.
Free OCR alternatives exist for occasional needs. Google Drive offers surprisingly capable OCR when you upload a PDF and open it with Google Docs — the resulting document contains extracted text you can copy into Excel. Microsoft OneNote provides similar functionality through its Copy Text from Picture feature on inserted images. Online services like OnlineOCR.net and NewOCR.com process individual files for free with daily limits, suitable for sporadic use but impractical for production workflows.
After OCR processing completes, treat the resulting file as you would any text-based PDF and import through Power Query. Expect to spend additional time cleaning the data because OCR introduces character recognition errors that text PDFs never contain. Common errors include zeros confused with capital O, ones confused with lowercase L, and decimal points misread as commas. Build a transformation step in Power Query that uses Replace Values to correct systematic OCR errors, then save this transformation for future imports.
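The Replace Values idea can also be sketched outside Excel. The snippet below is an illustration in Python rather than Power Query M. The function names and the substitution table are assumptions that cover only the letter-for-digit confusions mentioned above, and the fix should be applied only to columns that are supposed to be numeric. It does not attempt the comma-versus-decimal-point case, which needs to know the expected number format.

```python
# Illustrative table of letter-for-digit OCR confusions (an assumption;
# extend it with the systematic errors your own scans produce).
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})


def fix_ocr_digits(cell: str) -> str:
    """Correct common misreads in a cell that should contain only a number."""
    return cell.strip().translate(OCR_DIGIT_FIXES)


def fix_numeric_column(cells: list[str]) -> list[str]:
    """Apply the same correction to every cell of a numeric column."""
    return [fix_ocr_digits(cell) for cell in cells]
```

Running this on text columns such as vendor names would corrupt them, which is why the equivalent Replace Values step in Power Query should also be limited to the numeric columns.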
For documents requiring extreme accuracy like financial statements or legal contracts, always perform a manual verification pass comparing OCR output against the original scanned PDF. Spot-check ten or twenty random data points across the document. If error rates exceed two percent, consider higher-resolution rescanning or premium OCR software. Some workflows require human verification for every imported number, particularly when downstream calculations affect tax filings or audit submissions where errors carry legal consequences.
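The spot-check arithmetic is simple enough to sketch. The Python below is a hypothetical helper, not part of Excel or any OCR product: `pick_spot_checks` chooses which rows to compare against the original scan, and `needs_rescan` applies the two percent threshold from the paragraph above.

```python
import random


def pick_spot_checks(total_rows: int, sample_size: int, seed=None) -> list[int]:
    """Return sorted, distinct row indexes (0-based) to verify by hand."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(total_rows), min(sample_size, total_rows)))


def ocr_error_rate(checked: int, mismatches: int) -> float:
    """Share of checked values that differ from the original document."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    return mismatches / checked


def needs_rescan(checked: int, mismatches: int, threshold: float = 0.02) -> bool:
    """True when the observed error rate exceeds the threshold."""
    return ocr_error_rate(checked, mismatches) > threshold
```

Note that with only twenty checked values a single mismatch already gives a five percent rate, so a sample that small can flag a problem but cannot confirm that the true rate is below two percent. A larger sample gives a more meaningful estimate.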
Once data lives cleanly in Excel, you can apply standard analysis techniques, including AutoFilter, Advanced Filter, and the FILTER function, to slice imported records by criteria like date ranges, vendor names, or transaction categories. The combination of OCR, Power Query, and Excel filtering transforms paper archives into queryable databases supporting modern data analysis workflows that were impossible just a decade ago.

Never upload PDFs containing personally identifiable information, financial account numbers, or confidential business data to free online converters. Many of these services store uploaded files on their servers, sometimes indefinitely, and their privacy policies may permit data analysis or third-party sharing. Use Excel's native Power Query or licensed desktop software like Adobe Acrobat for any sensitive documents.
Troubleshooting PDF import issues requires systematic diagnosis because failures look similar but stem from different root causes. The most common problem is Power Query returning empty tables despite the PDF visibly containing data. This typically indicates a scanned image PDF rather than a text-based one — verify by attempting to select text in your PDF viewer. If selection fails or grabs entire blocks as images, you need OCR processing before any Excel import will succeed.
Column misalignment ranks as the second most frequent complaint. Tables appear in Excel with data shifted into wrong columns, headers merged into data rows, or numbers split across multiple columns. These issues usually result from PDFs using visual spacing rather than structured table tags. The PDF looked like a table on screen but technically rendered as positioned text fragments. Use Power Query's Split Column by Delimiter or by Number of Characters to manually reconstruct proper boundaries.
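To make the two splitting strategies concrete, here is a minimal Python sketch of the same logic (an illustration, not Power Query M). The function names, the sample widths, and the rule of treating two or more spaces as a column gap are all assumptions.

```python
import re


def split_by_widths(line: str, widths: list[int]) -> list[str]:
    """Cut a line into fields of fixed character widths, trimming padding.

    Mirrors the idea of splitting a column by number of characters.
    """
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


def split_by_gaps(line: str) -> list[str]:
    """Split on runs of two or more whitespace characters.

    Mirrors the idea of splitting by delimiter, where the delimiter is the
    visual gap between columns; single spaces inside a value are kept.
    """
    return re.split(r"\s{2,}", line.strip())
```

Fixed widths work when every row is positioned identically. Gap-based splitting tolerates values of varying length but breaks if a value itself contains a double space.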
Date and number formatting problems plague almost every PDF import. Excel imports numeric values as text strings when the PDF contained formatting like currency symbols, thousands separators, or parenthetical negatives like (1,234.56). In Power Query, first use Replace Values to strip dollar signs, commas, and parentheses, then change the column data type to Currency or Decimal Number; converting before the symbols are removed turns the affected cells into errors. When parentheses indicate negative amounts, add a conditional column that flags those rows before you strip the parentheses, then apply a minus sign to the converted value.
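The cleanup order matters: detect the parentheses, strip the symbols, and only then convert to a number. A minimal Python sketch of that sequence follows. It is not Power Query code, the function name `parse_amount` is hypothetical, and it assumes US-style formatting with a dollar sign, comma thousands separators, and a period as the decimal point.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(text: str) -> Optional[Decimal]:
    """Convert strings like '$1,234.56' or '(1,234.56)' to a Decimal.

    Returns None when the cleaned text is not a number, so one bad cell
    does not stop the whole import.
    """
    cleaned = text.strip()
    # Accounting notation: a value wrapped in parentheses is negative.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    # Strip currency symbol and thousands separators before converting.
    cleaned = cleaned.replace("$", "").replace(",", "").strip()
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -value if negative else value
```

For example, `parse_amount("(1,234.56)")` yields `Decimal("-1234.56")`, while a cell such as `"n/a"` yields `None`. `Decimal` is used instead of `float` so that currency values keep their exact cents.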
Multi-page tables that should append into a single dataset sometimes import as separate disconnected tables. Use Power Query's Append Queries feature to combine them, but verify column names and order match across all source tables first. If column names differ slightly between pages — perhaps Page Two used Date instead of Trans Date — rename columns consistently before appending. This step prevents the append from creating sparse tables with mostly empty columns containing scattered data.
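The rename-then-append sequence can be sketched in Python as follows. This is an illustration of the logic only. Each table is modeled as a list of row dictionaries, and the alias map (`Date` to `Trans Date`) is taken from the example above. Unlike an append that silently produces sparse columns, this sketch makes the deliberate choice to raise an error when the column names still differ after renaming.

```python
def rename_columns(rows: list[dict], aliases: dict) -> list[dict]:
    """Return rows with column names replaced according to the alias map."""
    return [
        {aliases.get(name, name): value for name, value in row.items()}
        for row in rows
    ]


def append_tables(tables: list[list[dict]], aliases: dict) -> list[dict]:
    """Rename columns consistently, then stack all tables into one list.

    Raises ValueError if any row's columns differ from the first row's,
    instead of producing a sparse combined table.
    """
    combined = []
    expected = None
    for index, table in enumerate(tables, start=1):
        for row in rename_columns(table, aliases):
            columns = set(row)
            if expected is None:
                expected = columns
            elif columns != expected:
                raise ValueError(
                    f"table {index} has columns {sorted(columns)}, "
                    f"expected {sorted(expected)}"
                )
            combined.append(row)
    return combined
```

Failing loudly on a mismatch is one option. The alternative is to let the append fill missing columns with blanks and inspect the result afterward.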
Performance degradation on large PDFs frustrates users importing hundred-page documents. Power Query loads the entire PDF into memory before parsing, which strains systems with limited RAM. Workarounds include splitting the source PDF into smaller chunks using free tools like PDFsam, then importing chunks separately and appending in Power Query. Alternatively, switch to a 64-bit Excel installation if you are currently running 32-bit, since the 32-bit version can address far less memory.
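When planning the split, it helps to work out the page ranges first. The Python helper below is a hypothetical planning aid and does not touch the PDF itself. It returns the inclusive, 1-based page ranges you would enter into whichever splitting tool you use.

```python
def page_chunks(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive 1-based (first_page, last_page) ranges covering the document."""
    if total_pages < 0 or chunk_size <= 0:
        raise ValueError("total_pages must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]
```

For a 120-page document split into chunks of 50 pages, this gives the ranges 1 to 50, 51 to 100, and 101 to 120.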
For best practices going forward, document your transformation steps with comments inside Power Query Editor by right-clicking applied steps and selecting Properties to add descriptions. This documentation helps future you remember why specific transformations exist, especially for monthly recurring imports you might not touch for weeks. Save your Excel workbooks as templates when you build reusable PDF import workflows, allowing colleagues to leverage your work without rebuilding everything.
Finally, build error handling into your imports by using the M language's try ... otherwise expression to handle missing values or unexpected data types gracefully. Combine PDF imports with other Excel features like Freeze Panes to create polished reports where imported PDF data displays alongside headers locked in place. These finishing touches separate professional-quality work from quick one-off conversions.
Implementing professional PDF-to-Excel workflows requires more than just knowing the technical steps — it demands strategic thinking about repeatability, accuracy verification, and integration with downstream Excel features. The most successful Excel users treat PDF imports as the first stage of a larger data pipeline rather than isolated one-time tasks. This mindset shift transforms how you approach every import and dramatically reduces the time spent on recurring monthly or quarterly reporting tasks.
Start by establishing a consistent folder structure for source PDFs and destination Excel workbooks. Create dedicated folders like PDF_Source_Files and Excel_Imports with subfolders organized by month or vendor. This organization becomes critical when Power Query connections reference file paths — moving files breaks refresh operations. Use UNC network paths rather than mapped drive letters for files stored on shared drives, since drive letter assignments vary between users while UNC paths remain consistent.
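A quick way to audit source paths is to check whether each one is a UNC path or depends on a drive letter. The Python sketch below uses the standard library's `PureWindowsPath`, which parses Windows paths on any operating system. The function name and the server and share names in the usage note are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """True if the path starts with a \\\\server\\share prefix rather than a drive letter."""
    # For UNC paths, .drive is the "\\server\share" part; for mapped or
    # local drives it is something like "Z:"; for relative paths it is "".
    return PureWindowsPath(path).drive.startswith("\\\\")
```

For example, `is_unc_path(r"\\fileserver\finance\PDF_Source_Files\jan.pdf")` is true, while `is_unc_path(r"Z:\PDF_Source_Files\jan.pdf")` is false and would be a candidate for rewriting before the workbook is shared.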
Build template workbooks for common PDF types you process regularly. If your accounting team processes vendor invoices in similar formats every month, create a master template containing pre-built Power Query transformations for each vendor's typical layout. New monthly imports then require only swapping the source file path rather than rebuilding transformation logic from scratch. This template approach scales beautifully across teams, with junior staff handling routine imports while seniors focus on exception cases requiring custom logic.
Document everything in a separate worksheet within your import workbook. Include sections for source file location, expected page count, known data quality issues, and step-by-step refresh instructions. Future colleagues inheriting your work — or your future self six months later — will thank you for this documentation. Include screenshots of correctly imported data so users can visually verify successful imports rather than guessing whether anything went wrong.
For high-volume environments processing dozens of PDFs daily, consider Power Automate flows that trigger Excel refreshes automatically when new PDFs arrive in a SharePoint folder. This integration between Microsoft 365 services creates fully automated pipelines where PDFs flow from email attachments through SharePoint into Excel reports with zero manual intervention. Setup requires moderate technical investment but pays dividends quickly in operations teams that previously dedicated multiple hours daily to manual data entry.
Test your workflows with intentionally corrupted or unusual sample PDFs to discover edge cases before they break production reports. Add a missing page, modify column orders, change date formats, and introduce typos to see how your Power Query transformations respond. Build defensive transformations that handle expected variations gracefully — for example, use conditional logic to detect whether dates appear in MM/DD/YYYY or DD/MM/YYYY format and apply appropriate parsing rules dynamically.
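The date-format check can be sketched as follows. This is a Python illustration of the conditional logic, not M code, and it assumes slash-separated dates with four-digit years. The idea is that a component greater than 12 cannot be a month, so its position reveals the order. If no value in the column settles the question, the sketch refuses to guess.

```python
from datetime import datetime


def detect_date_order(values: list[str]) -> str:
    """Return 'MDY', 'DMY', or 'ambiguous' for a column of slash-separated dates."""
    first_over_12 = False
    second_over_12 = False
    for value in values:
        first, second, _year = value.strip().split("/")
        if int(first) > 12:
            first_over_12 = True
        if int(second) > 12:
            second_over_12 = True
    if first_over_12 and not second_over_12:
        return "DMY"
    if second_over_12 and not first_over_12:
        return "MDY"
    # Either nothing exceeded 12, or the column mixes both orders.
    return "ambiguous"


def parse_dates(values: list[str]) -> list:
    """Parse the whole column using the detected order; fail if it is unclear."""
    order = detect_date_order(values)
    if order == "ambiguous":
        raise ValueError("cannot infer date order; specify the format explicitly")
    fmt = "%d/%m/%Y" if order == "DMY" else "%m/%d/%Y"
    return [datetime.strptime(value.strip(), fmt).date() for value in values]
```

Detection is done per column rather than per value because a date such as 03/04/2024 is valid in both formats. Only the other rows in the same column can disambiguate it.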
Finally, invest time in learning the underlying M language that powers Power Query transformations. While the graphical interface handles most common operations, M code enables advanced techniques like dynamic file path construction, parameterized queries, and complex conditional transformations impossible through clicks alone. Microsoft provides free documentation, and the broader Excel community shares M code snippets for common PDF parsing challenges through forums, blogs, and YouTube tutorials that accelerate your learning curve significantly.
About the Author
Attorney & Bar Exam Preparation Specialist
Yale Law School
James R. Hargrove is a practicing attorney and legal educator with a Juris Doctor from Yale Law School and an LLM in Constitutional Law. With over a decade of experience coaching bar exam candidates across multiple jurisdictions, he specializes in MBE strategy, state-specific essay preparation, and multistate performance test techniques.