
A fully working blog-platform home page with editorial blog posts, comments and tags; a complete admin section; MySQL fulltext AJAX search; and section/user/links pages.

publishing platform blog php


EOS is a component-based development platform, a business-oriented middleware. This project focuses on integrating open-source applications with EOS, such as fulltext search, general data query, general office automation, etc.

enterpriseapplication j2ee open java web eos

Similar Entries

A module that displays a block with the nodes most similar to the currently viewed one, based on the title and body fields. Related pages are shown as a list in a block. Similar Entries supports only MySQL-based sites, because it relies on MySQL's FULLTEXT indexing for MyISAM tables. A FULLTEXT query helps find relevant content in other nodes using a natural-language search, which interprets the search string as a phrase in natural human language. The installer adds the FULLTEXT index to your node revision table.
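The underlying mechanism can be sketched with plain MySQL statements (the table and column names `node_revisions`, `nid`, `title` and `body` are illustrative assumptions, not necessarily what the module's installer creates):

```sql
-- Add a FULLTEXT index over the title and body columns
-- (requires a MyISAM table in older MySQL versions).
ALTER TABLE node_revisions ADD FULLTEXT ftx_title_body (title, body);

-- Natural-language search: each row is scored by relevance to the phrase,
-- and the most similar nodes are returned first.
SELECT nid, title,
       MATCH (title, body) AGAINST ('open source search engines') AS score
FROM node_revisions
WHERE MATCH (title, body) AGAINST ('open source search engines')
ORDER BY score DESC
LIMIT 5;
```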


Documentation in progress.

Lib Dependencies:
- commons-lang-2.4.jar
- google-collect-1.0-rc2.jar
- commons-logging.jar

Sample: Simple Search-Capable Model

    package org.gaesearch.sample;

    @PersistenceCapable(identityType = IdentityType.APPLICATION)
    @SearchCapable
    public class Home {
        @PrimaryKey
        @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
        @SearchKey // Note: currently only supports a Long id key
        private Long id;

        @Persistent
        @SearchField("address")
        private String address1;

        @Persistent
        @SearchField("address")
        private String address2;

        @Persistent
        @SearchField("address")
        private String address3;

        @Persistent
        @SearchField
        private String area;

        @Persistent
        @SearchField
        private String city;

        // getters & setters
    }

Setup Simple Index Service

    public class SimpleIndexService {
        private IndexSearchEngine indexSearchEngine =
            new GIndexSearchEngine("org.gaesearch.sample.Home");

        // Index the given object.
        public void index(Object obj) {
            Map settings = new HashMap();
            settings.put(IndexSearchEngine.OBJECT, obj);
            settings.put(IndexSearchEngine.PMF, ObjectFactory.getPMF()); // set PMF
            indexSearchEngine.index(settings);
        }

        // Remove the given object from the index.
        public void unIndex(Object obj) {
            Map settings = new HashMap();
            settings.put(IndexSearchEngine.OBJECT, obj);
            settings.put(IndexSearchEngine.PMF, ObjectFactory.getPMF()); // set PMF
            indexSearchEngine.unIndex(settings);
        }
    }

Simple Search Service

    public class SimpleSearchService {
        private IndexSearchEngine indexSearchEngine =
            new GIndexSearchEngine("org.gaesearch.sample.Home");

        public SimpleSearchService() {
            // set the persistence manager factory
            this.indexSearchEngine.setPersistenceManagerFactory(ObjectFactory.getPMF());
        }

        public List searchAddress(String text) {
            return indexSearchEngine.search(getSettings("address", text));
        }

        public List searchCity(String text) {
            return indexSearchEngine.search(getSettings("city", text));
        }

        public List searchAll(String text) {
            return indexSearchEngine.search(getSettings(null, text));
        }

        private Map getSettings(String type, String text) {
            Map settings = new HashMap();
            settings.put(IndexSearchEngine.OBJECT, new SearchBy(Home.class, type, text));
            return settings;
        }
    }

fulltext search gae


MongoDB fulltext search implementation using the xmlpipe2 interface of the Sphinx fulltext search engine
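For reference, Sphinx's xmlpipe2 interface expects the indexer to be fed a document stream of roughly the following shape (the field and attribute names here are illustrative assumptions; the actual schema depends on the MongoDB collection being exported):

```xml
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>
  <sphinx:schema>
    <sphinx:field name="title"/>
    <sphinx:field name="body"/>
    <sphinx:attr name="published" type="timestamp"/>
  </sphinx:schema>
  <sphinx:document id="1">
    <title>First post</title>
    <body>Document text pulled from MongoDB goes here.</body>
    <published>1262304000</published>
  </sphinx:document>
</sphinx:docset>
```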


Yet another fulltext search engine.


Summary

The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page. The library already provides specific strategies for common tasks (for example, news article extraction) and can also be easily extended for individual problem settings. Extracting content is very fast (milliseconds), needs only the input document (no global or site-level information is required) and is usually quite accurate.

Boilerpipe is a Java library written by Christian Kohlschütter. It is released under the Apache License 2.0. The algorithms used by the library are based on (and extend) concepts from the paper "Boilerplate Detection using Shallow Text Features" by Christian Kohlschütter et al., presented at WSDM 2010 -- The Third ACM International Conference on Web Search and Data Mining, New York City, NY, USA. The paper and the presentation slides are available online.

News

(2010-01-30) boilerpipe 1.0.3 - Two bug fixes (XML parsing issues), Issues #1 and #2. (Thanks to Tom Taylor, Kaspar Fischer and nedunk for reporting the problems.)
(2009-12-10) boilerpipe 1.0.2 - This release hot-fixes a NekoHTML bug which caused low-quality results in a rare situation. (Thanks to Kris Jirapinyo for reporting the problem.)
(2009-12-04) boilerpipe 1.0.1 - Added the dependency libs (Xerces and NekoHTML) and the Javadocs to the binary tarball. (Thanks to Mike Matthews for reporting the problem.)
(2009-12-03) boilerpipe 1.0.0 - The code is now online. Have fun!

Getting Started

To get started, see the documentation in the Wiki and the binary and source tarballs. Please also read the FAQ; it contains important information.

About the Author

Christian Kohlschütter is currently working at the L3S Research Center. He is a PhD student of Professor Dr. Wolfgang Nejdl. His main research interests are in the area of Web Information Retrieval and Quantitative Linguistics.

webpage template boilerplate html web text full removal extraction library cleaning content fulltext java


acts_as_mysqlsearchable makes MySQL fulltext search easily available directly in your models.


Support to MySQL fulltext search syntax in CakePHP Model::find method.


Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back-end database. It is a great tool for adding search functionality to your web site or building your own custom search engine. Sphider is small, easy to set up and modify, and is used on thousands of websites across the world. Sphider supports all standard search options, but also includes a plethora of advanced features such as word autocompletion and spelling suggestions. The sophisticated administration interface makes administering the system easy.

searchengine spider search-engine search_engines fulltext-search search indexing php search_engine


CustomHelp

IDE Expert for Delphi 2005+ which extends the Delphi help with some useful features. Features:
- help via the F1 key
- search in all installed Hx namespaces (Microsoft Developer Help), e.g. the Jedi help
- search for help in the old (*.hlp) Windows help files (e.g. the Delphi 7 help)
- search in HTML Help (*.chm) files
- search using the index or fulltext

NPipe 1.0

Named Pipe Objects for Delphi [6|7|2005|2006|2007]. Features:
- full-featured client and server components for named-pipe access
- enables interprocess communication, even across the network
- for a further description, see the wiki

ResEd 1.6.7

Project Resource Editor for Delphi [2005|2006|2007|2009|2010]. Features:
- full IDE integration
- easily access and manage the resource files contained in your project
- adding/deleting/renaming of

cna customhelp resed tuo tuoscriptpak tsettings keyboardled delphi


This software allows the automated collection of large numbers of full-text articles. You can use the GUI to perform a PubMed search and then download the PDFs from the search results, or you can work with the API and just give it a set of starting-point URLs or PubMed IDs.

Getting, Installing and Running the Software

1. Please check the requirements for this software.
2. Please download the appropriate zip file for your operating system (please note, PowerPC Macs are not currently supported).
3. Decompress the zip file using software appropriate for your operating system (e.g. Mac: Archive Utility, Linux: gunzip, Windows: WinZip or 7-Zip).
4. Open a terminal window and navigate to the directory where you decompressed the zip file (e.g. cd c:\articledownloader).
5. Run the application using the run script, which is called run.sh on Linux and Mac and run.bat on Windows. On Windows you will probably be able to double-click the run.bat file in Explorer.
6. Once you have it working, see the walkthrough page on how to actually download some articles.

Documentation

If you want further information on this software, you can download the Javadoc archive in the Downloads section, or look at the documentation in the Wiki section or the featured Wiki pages listed on the right-hand side of this page.

Problems

If you are having problems, please look at the troubleshooting page, or file an Issue so others can benefit from our attempts to resolve the problem.

scientificliterature pdfdownload fulltext webagent


acts_as_xapian plugin converted to a gem. Provides integration with the xapian fulltext search engine.


Pyndexter provides a uniform API for accessing a variety of full-text search and indexing engines. It aims to be to full-text indexing systems what the Python DB API is to databases. It presents a uniform query syntax to the user, with support for quoted search terms, boolean operations, sub-expressions and attribute (metadata) querying. Supported indexers include a basic but functional pure-Python indexer, plus adapters for Hype, Hyper Estraier, Lucene, Lupy, Pyndex, Swish-e and Xapian.
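As an illustration of what such a uniform indexing API looks like, here is a hypothetical adapter-style sketch in Python (the class and method names are invented for illustration; they are not pyndexter's actual interface):

```python
from abc import ABC, abstractmethod

class Indexer(ABC):
    """Illustrative uniform interface: each engine adapter implements this."""

    @abstractmethod
    def index(self, doc_id, text):
        """Add a document to the index."""

    @abstractmethod
    def search(self, query):
        """Return the ids of documents matching all query terms."""

class InMemoryIndexer(Indexer):
    """A trivial pure-Python backend, analogous to pyndexter's built-in one."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, text):
        # Store each document as its set of lowercase terms.
        self._docs[doc_id] = set(text.lower().split())

    def search(self, query):
        # AND semantics: every query term must appear in the document.
        terms = set(query.lower().split())
        return [d for d, words in self._docs.items() if terms <= words]
```

The point of the design is that callers depend only on `Indexer`, so a Xapian- or Lucene-backed adapter can be swapped in without changing application code.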

search_engine search indexing indexer fulltext-search search-engine information_retrieval searchengine


Groonga - fulltext search engine.


Flax is a project to develop an open source enterprise search engine application based on the Xapian search engine library. It also contains a clean-and-simple Python interface suitable for many users of Xapian, built on the standard Xapian Python interface, together with various other add-ons such as performance testing utilities.

searchengine information_retrieval search-engine indexing search xapian fulltext-search indexer search_engine enterprise


HITEC is a software package for very high-accuracy automatic text categorization. The engine of HITEC is an implementation of UFEX (Universal Feature EXtractor) for textual documents. UFEX is a very sophisticated learning method that ensures the outstanding categorization performance of HITEC; HITEC outperformed its competitors on all investigated document collections. (For further details, read the white paper.)

HITEC applies a supervised learning method: it learns from training data (learning phase) and is then able to classify new documents into known categories (operational phase). Obviously, the performance of categorization strongly depends on the quality of the training data. For efficient training HITEC requires:
- a fixed category system (usually ordered in a hierarchy); during the operational phase, new, unknown documents will be classified into that system;
- some relevant training documents for each category of the category system.

During operation, HITEC returns an ordered list of the most relevant categories for an unknown document, based on confidence values. The greater this value, the more relevant HITEC deems the corresponding category to the document. The returned list of categories can be further processed depending on the nature of the classification problem. If perfect accuracy is required, an expert can accept, revise, or reject the categories proposed by HITEC. If the accuracy of around 90% observed in tests is sufficient, then the proposed categories can be accepted based on their confidence values.

HITEC is programmed very efficiently, so its high performance comes with fast operation even on very large document collections. Once HITEC has been trained on a document collection, the operational phase runs in real time (see also the test pages). It is able to process hundreds of gigabytes in reasonable time (training phase) and work with thousands of categories on an average PC.
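The learn-then-rank workflow described above can be sketched in a few lines of Python. This is a toy term-overlap scorer under assumed inputs, not HITEC's UFEX algorithm; it only illustrates the two phases and the confidence-ordered category list:

```python
from collections import Counter, defaultdict

def train(samples):
    """Learning phase: samples is a list of (text, category) pairs.
    Returns per-category word counts as the 'model'."""
    model = defaultdict(Counter)
    for text, category in samples:
        model[category].update(text.lower().split())
    return model

def classify(model, text):
    """Operational phase: return (category, confidence) pairs,
    ordered from most to least relevant."""
    words = Counter(text.lower().split())
    total = sum(words.values())
    scores = {}
    for category, counts in model.items():
        # Crude confidence: fraction of the document's words seen
        # in this category's training data.
        overlap = sum(min(counts[w], n) for w, n in words.items())
        scores[category] = overlap / total if total else 0.0
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

As in HITEC's operational phase, a downstream consumer could accept the top category automatically when its confidence is high enough, or hand the ranked list to an expert for review.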

search_engine information_analysis fulltext-search indexer search information_retrieval

Search plugin for Bazaar

This plugin provides the ability to index the contents of a branch and then perform fast queries against the fulltext index. It is alpha quality, which is to say that it works for many users, and bug reports are appreciated. The index allows searching the contents of commit messages, the paths present in the various commits, and the content of files. File content indexing is diff-based for better search results.