Open Source Text Processing Software - Page 2

Text Processing Software

  • 1
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 69 This Week
  • 2

    abnTeX

    abnTeX has moved to a new address: http://www.abntex.net.br

    NOTE: abnTeX has moved to a new address: http://www.abntex.net.br. abnTeX is a set of LaTeX macros that follow the rules of ABNT (Associação Brasileira de Normas Técnicas, the Brazilian Association of Technical Standards). The project was completely rebuilt based on the new ABNT rules, using a new technique for producing the class. More information is available at the project portal (http://www.abntex.net.br) and the developer group (http://groups.google.com/group/abntex2). This SourceForge page holds the assets of the original project, which was originally hosted on the CodigoLivre.org portal.
    Downloads: 137 This Week
  • 3
    HarfBuzz

    Open source text shaping engine

    HarfBuzz is an open source text-shaping engine with a C API that turns fonts and strings of character codes into a form that is correctly arranged for the corresponding language and writing system. This is essentially the process of text shaping: translating a string of character codes into a properly arranged sequence of glyphs that can be rendered onto a screen or into final output form for inclusion in a document. Shaping depends on a number of factors: the input string, the active font, the script (or writing system) of the string, and the string's language. Various font formats have their own set of standard text-shaping rules. With HarfBuzz, you can properly shape all the major writing systems. HarfBuzz is cross-platform and supports all major software platforms and font formats.
    Downloads: 13 This Week
  • 4
    regexxer
    regexxer is a nifty GUI search/replace tool featuring Perl-style regular expressions. If you need project-wide substitution and you're tired of hacking sed command lines together, then you should definitely give regexxer a try.
    Downloads: 68 This Week
  • 5
    Ada Class Library

    Ada Class Library - an object-oriented library for Ada.

    Text search and replace. Scripting (small tool programs). CGI scripts. Execution of external programs (incl. I/O redirection). Garbage collection. Extended Booch Components. CD-Recorder.
    Downloads: 285 This Week
  • 6
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to generate and manipulate standards-compliant PDF documents with a feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, and OCR. The latest versions of iText feature an improved document engine, high- and low-level programming capabilities, and a more efficient modular structure. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons, and tools can be found at https://github.com/itext/.
    Downloads: 269 This Week
  • 7
    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. From a grammar, ANTLR generates a parser that can build and walk parse trees. It is widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for Hive and Pig, the data warehouse and analysis systems for Hadoop, both use ANTLR. Lex Machina uses ANTLR for information extraction from legal texts. Oracle uses ANTLR within SQL Developer IDE and their migration tools. NetBeans IDE parses C++ with ANTLR. The HQL language in the Hibernate object-relational mapping framework is built with ANTLR.
    Downloads: 10 This Week
  • 8
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 88 This Week
  • 9
    Find And Replace Text command-line utility. New and improved version of the well-known grep command, with advanced features such as case adaptation of the replace string, find (and replace) in filenames, and auto CVS edit. Moved to https://github.com/lionello/fart-it
    Downloads: 55 This Week
  • 10
    IvriTeX is a project spun off from heblatex. Its purpose is to maintain Hebrew LaTeX support and to provide a meeting point where Hebrew TeXers can coordinate improvements to that support.
    Downloads: 123 This Week
  • 11
    FCKeditor

    FCKeditor (retired)

    FCKeditor is the previous version of CKEditor and was discontinued after version 2. The new CKEditor is redesigned from the ground up, offering more WYSIWYG text editing features, enhanced security, and better integration. Rather than staying on the retired FCKeditor, switch to the new CKEditor at ckeditor.com.
    Downloads: 34 This Week
  • 12

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is a tool for concatenating PDF files. It can concatenate, extract, encrypt, decrypt, and configure PDF files, and convert image files to PDF. Both a GUI version and a CUI version are available. iText.NET is a port of iText to the .NET Framework using J#. This library allows you to generate PDF, (X)HTML, XML, and RTF files on the Microsoft .NET Framework, including ASP.NET.
    Downloads: 52 This Week
  • 13
    LMG2Shruti is a free non-Unicode to Unicode font converter. It converts the LMG Arun font to the Gujarati Unicode Shruti font.
    Downloads: 110 This Week
  • 14
    meld-installer

    Meld Installer for Windows

    Bundles Portable Python (with PyGTK) and Meld together in an easy-to-use installer, so you don't have to worry about setting up Python or PyGTK and can keep Meld's Python separate from other Python installations on your machine. NOTE: Meld 3.11 and later have official installers, so this project is no longer supported. You can download the new installer here: https://download.gnome.org/binaries/win32/meld/. You should uninstall the old 1.8 version before upgrading.
    Downloads: 49 This Week
  • 15
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one-stop shop for natural language processing in Java. CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports six languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, accessible with a simple API, and serializable to a Google Protocol Buffer.
    Downloads: 6 This Week
  • 16
    Tinn-R

    Tinn-R Editor - GUI for R Language and Environment

    Tinn-R is an open source (GNU General Public License) project. It is a generic ASCII/Unicode editor and word processor for the Windows operating system, well integrated with R, with the characteristics of both a graphical user interface (GUI) and an integrated development environment (IDE). Project leader and main developer: José Cláudio Faria/UESC/DCET. Language: Object Pascal. IDE: Delphi 2007.
    Downloads: 27 This Week
  • 17
    The Guide
    The Guide is a tree-based information management tool. It lets you organize information as nodes in a tree. (A two-pane rich-text outliner for Windows.)
    Downloads: 31 This Week
  • 18
    OOoFBTools

    Open/Libre Office extension for converting eBooks in fb2 format

    Open/Libre Office extension for converting and processing eBooks in FictionBook2 format, with a validator. Apache OpenOffice Extensions page: http://extensions.openoffice.org/en/project/ooo-fbtools Libre Office Extensions page: http://extensions.libreoffice.org/extension-center/fbtools Attention: developers wanted! In recent years dikbsd has found it hard to keep up even with routine maintenance, and would like to pass the project into good hands, with a proper handover of experience, before it reaches a crisis.
    Downloads: 37 This Week
  • 19
    FAR - Find And Replace
    Search and replace operations on file content across multiple files. Recursive operations within entire directory trees. FAR comes with support for regular expressions (regex) over multiple lines, automatic backup, and various character encodings. Run grep-like extractions to condense or rearrange sources, or perform bulk file renaming.
    Downloads: 30 This Week
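As a rough illustration of the kind of operation the FAR entry describes (recursive, multi-line regex replacement with a backup copy), here is a minimal Python sketch. It is not FAR's code or interface: the function name `replace_in_tree`, the `*.txt` filter, and the `.bak` suffix are all assumptions made for this example.

```python
import re
import shutil
import tempfile
from pathlib import Path


def replace_in_tree(root, pattern, replacement, glob="*.txt", encoding="utf-8"):
    """Apply a regex substitution to every matching file under root.

    Returns the list of files that were changed. Each changed file is first
    copied next to itself with a ".bak" suffix (hypothetical convention).
    """
    # DOTALL lets "." match line breaks, so one pattern can span several lines.
    regex = re.compile(pattern, re.DOTALL)
    changed = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        original = path.read_text(encoding=encoding)
        updated, count = regex.subn(replacement, original)
        if count == 0:
            continue  # nothing matched, leave the file untouched
        shutil.copy2(path, path.with_name(path.name + ".bak"))  # backup first
        path.write_text(updated, encoding=encoding)
        changed.append(path)
    return changed


if __name__ == "__main__":
    # Small demonstration on a throwaway directory tree.
    with tempfile.TemporaryDirectory() as tmp:
        nested = Path(tmp) / "docs" / "sub"
        nested.mkdir(parents=True)
        (nested / "a.txt").write_text("BEGIN\nold text\nEND\n", encoding="utf-8")
        changed = replace_in_tree(tmp, r"BEGIN.*?END", "BEGIN\nnew text\nEND")
        print([p.name for p in changed])
```

The `encoding` parameter stands in for the "various character encodings" mentioned above; any codec name Python knows can be passed.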
  • 20
    Pdftohtml is a tool based on the Xpdf package that translates PDF documents into HTML format.
    Downloads: 28 This Week
  • 21
    The DITA Open Toolkit is an implementation of the OASIS DITA XML Specification. The Toolkit transforms DITA content into many deliverable formats. See https://www.dita-ot.org/ for documentation and links to downloads. The source code and issue trackers have been moved to https://github.com/dita-ot/dita-ot
    Downloads: 25 This Week
  • 22
    CONVERTCP

    Text File Codepage Converter for the Windows command line

    This command-line utility is a codepage converter for changing the character encoding of text. It fully supports charsets such as ANSI code pages, UTF-8, UTF-16 LE/BE, UTF-32 LE/BE, and EBCDIC. It is also designed to convert big text files. It runs on Windows XP onwards (tested on XP, Windows 7, Windows 8.1, Windows 10, and Windows 11). The "readme.txt" file and the Wiki give you more information. You'll find the compiled tool for 32-bit (x86) and 64-bit (x64) Windows in the "bin" directory. The C source code is available in the "src" directory. Just click on the "Files" tab. Whether or not you have a SourceForge account, you are welcome to post questions or feedback about CONVERTCP in the forum. Click on the "Discussion" tab.
    Downloads: 48 This Week
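To make the idea of codepage conversion concrete, here is a minimal Python sketch that re-encodes a text file in fixed-size chunks, so a large file never has to fit in memory. It is an illustration only and says nothing about how CONVERTCP itself is implemented (the tool is written in C); the function name `convert_encoding` and the chunk size are assumptions.

```python
import tempfile
from pathlib import Path


def convert_encoding(src, dst, src_encoding, dst_encoding, chunk_chars=64 * 1024):
    """Re-encode a text file chunk by chunk (hypothetical helper).

    Python's text layer decodes incrementally, so a multi-byte character
    split across two reads is still decoded correctly.
    """
    # newline="" disables newline translation, so line endings pass through unchanged.
    with open(src, "r", encoding=src_encoding, newline="") as reader, \
         open(dst, "w", encoding=dst_encoding, newline="") as writer:
        while True:
            chunk = reader.read(chunk_chars)
            if not chunk:
                break  # end of input
            writer.write(chunk)


if __name__ == "__main__":
    # Convert a small Windows-1252 file to UTF-8 in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.txt"
        dst = Path(tmp) / "out.txt"
        src.write_bytes("café\r\n".encode("cp1252"))
        convert_encoding(src, dst, "cp1252", "utf-8")
        print(dst.read_bytes())
```

The encoding names are Python codec names (for example `cp1252`, `utf-16-le`, or `cp037` for one EBCDIC variant), which need not match the names CONVERTCP accepts.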
  • 23
    Vim provides a rich set of tools that make writing LaTeX easy, pain-free, and quite pleasurable. This website aims to bring together the tools the Vim community has produced over the years into a central repository.
    Downloads: 28 This Week
  • 24
    XSLT syntax highlighting

    Java based XSLT Processor extension for syntax highlighting

    Please note that the project has moved to GitHub: https://github.com/xmlark/xslthl. This is an implementation of syntax highlighting as an extension module for XSLT processors (Xalan, Saxon). If you have, for example, an article about programming written in DocBook, its code examples can be syntax highlighted automatically during the XSLT processing phase.
    Downloads: 45 This Week
  • 25
    Vrapper

    Vim-like editing in Eclipse

    Vrapper is an Eclipse plugin that acts as a wrapper for existing Eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Eclipse Update Site: http://vrapper.sourceforge.net/update-site/stable
    Downloads: 20 This Week