extract-data

Here are 226 public repositories matching this topic...

opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

python pdf parser ocr pdf-converter extract-data document-analysis pdf-parser layout-analysis ai4science pdf-extractor-rag pdf-extractor-llm pdf-extractor-pretrain

Updated Aug 1, 2024
Python

bda-research / node-crawler

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

nodejs javascript jquery crawler spider cheerio extract-data

Updated Aug 1, 2024
TypeScript

ShubhRanpara / Auto-Filler

This repository contains my team's internship project work at Flexbox Technologies. We have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.

pdf pdf-converter python3 docx pptx extract-data gemma form-filling qa-automation huggingface-transformers streamlit-application llms langchain flan-t5 faiss-vector-database

Updated Aug 1, 2024
Python

ShubhRanpara / Auto-Filler-Web

This repository contains my internship project work at Flexbox Technologies. I have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.

qa json automation pdf-converter python-3 pdf-document extract-data html-css-javascript form-filler medical-application patient-data docx-files pptx-files huggingface-transformers streamlit-webapp llms langchain flan-t5 faiss-vector-database

Updated Aug 1, 2024
HTML

meltano / meltano

Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

Updated Aug 1, 2024
Python

MeltanoLabs / tap-dbt

Singer Tap for dbt API v2 built with the Meltano SDK

dbt elt extract-data singer-io singer-tap meltano-sdk dbt-cloud

Updated Jul 31, 2024
Python

pdfix / pdfix_sdk_example_java

PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...

Updated Jul 31, 2024
Java

laur89 / docker-seedbox-rclone-fetch-extract

Dockerised service pulling data from remote seedbox & extracting archives

torrent downloader unzip extractor sonarr plex extract unraid seedbox rclone emby radarr extract-data unrar data-downloader jellyfin servarr

Updated Jul 29, 2024
Shell

MeltanoLabs / tap-stackexchange

Singer tap for the StackExchange API

elt extract-data stackexchange singer-io singer-tap meltano-sdk

Updated Jul 29, 2024
Python

guillaC / SQLiteDiskExplorer

SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.

information-retrieval sql analysis sqlite extractor sqlite-database information-extraction analyzer database-management infosec sqlite3 dear-imgui extract-data forensic-analysis forensic filescan dearimgui filescanner forensics-tools

Updated Jul 28, 2024
C#

ai92-github / llm-reader

Turns a webpage into LLM-friendly input text. Makes extracting image and webpage links easy.

scraper scraping extract-data webscraping scraping-websites llm llm-agent

Updated Jul 27, 2024
Python

pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

python pdf font data-science ocr tesseract epub mupdf text-processing pdf-documents extract-data table-extraction text-shaping xps pymupdf

Updated Jul 26, 2024
Python

pdfix / pdfix_sdk_example_dotnet

Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...

Updated Jul 31, 2024
C#

geanpannellini / real_estate_property_transactions

A repository containing comprehensive data on real estate property transactions, encompassing transaction details, property characteristics, and market insights for analytical purposes in the real estate industry.

sql data-visualization data-structures dbt extract-data

Updated Jul 19, 2024

elixir-crawly / crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

crawler scraper erlang elixir spider scraping crawling extract-data scraping-websites

Updated Jul 4, 2024
Elixir

Anjali1751 / Extracting-data-of-scanned-images

Extracting Data Of Scanned Images

flask data images extract-data input-output

Updated Jun 30, 2024
Python

Warard / WordExtractor

Python program that extracts data from a specific Word document used in my company. Before this program existed, the data was extracted manually by opening hundreds of Word documents one by one and copying and pasting information into an Excel file. Now it is fully automatic.

word docx document extract-data

Updated Jun 27, 2024
Python

DevExpress-Examples / wpf-dashboard-how-to-update-extract-data-source-file

This example demonstrates how to update the extract data file at runtime.

wpf extract desktop business-intelligence extract-data data-access wpf-dashboard dashboard-for-wpf

Updated Jun 25, 2024
C#

DevExpress-Examples / winforms-dashboard-extract-data-source

This example demonstrates how to create the Extract data source, replace existing dashboard data sources with Extract data sources and update the Extract data file.

dotnet winforms desktop business-intelligence extract-data data-access winforms-dashboard dashboard-for-winforms winforms-dashboard-viewer

Updated Jun 25, 2024
C#

LivingSkySchoolDivision / MySchoolSaskIntegrations

Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)

integrations csv integration xml extract-data aspen mss vig follett myschoolsask

Updated Jun 24, 2024
PowerShell
