API – [Application Programming Interface] defines the correct way for a developer to write a program that requests services from an operating system or another application

Artificial Intelligence – the development of intelligent machines and software that are capable of perceiving their environment, taking corresponding action when required, and even learning from those actions


Big Data – a massive amount of data, difficult to analyze using common database techniques, demanding cost-effective, innovative forms of information processing to enable data-driven decision making

Business Intelligence – the theories, methodologies and processes to make data, particularly business-related data, understandable and more actionable


Cloud – a distributed computing system over a network used for storing data off-premises


Data Aggregation – the process of transforming scattered data from numerous sources into a single new source

Data Cleansing – the process of reviewing and revising data in order to delete duplicates, correct errors and provide consistency
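
As an illustration of the three steps named above (deleting duplicates, correcting errors, providing consistency), here is a minimal Python sketch. The record layout with "name" and "email" fields and the function name cleanse are assumptions made for this example only; real cleansing rules depend on the data at hand.

```python
def cleanse(records):
    """Trim whitespace, normalise e-mail case, and drop duplicate records."""
    seen = set()
    cleaned = []
    for record in records:
        name = " ".join(record["name"].split())   # collapse stray whitespace
        email = record["email"].strip().lower()   # enforce consistent casing
        key = (name.lower(), email)
        if key in seen:                           # duplicate after normalisation
            continue
        seen.add(key)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical sample data: two spellings of the same record.
raw = [
    {"name": " Ada  Lovelace", "email": "ADA@example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
]
print(cleanse(raw))  # one record remains
```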

Data Consolidation – the process of combining multiple sources of linked data into one larger source called a dataset

Data Connector – transfers data using data source definitions and data transfer rules that define the parameters for working with a particular data source

Data Extraction – the process of retrieving data out of structured and unstructured data sources for further data processing or data storage

Data Mining – the process of discovering patterns in large data sets where the goal is to extract information and transform it into an understandable structure for further use

Data Set – a collection of related sets of information usually presented in tabular form

Data Transformation – converts a set of data values from the data format of a source data system into the data format of a destination data system by mapping it correctly

Database Normalisation – the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy

Distributed System – a piece of software that ensures the communication between a collection of independent computers so that they appear to its users as a single coherent system


Elasticsearch – an open source, Java-based search engine built on top of Apache Lucene and originally released under the Apache license


Hadoop – an open-source framework that is built to enable the processing and storage of big data across a distributed file system


Indexing – the act of classifying and providing an index in order to make items easier to retrieve
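
One common form of index in search software is an inverted index, which maps each word to the documents that contain it. The sketch below is a simplified illustration, not how any particular search engine implements indexing; the function name build_index and the sample documents are assumptions.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical sample documents keyed by id.
docs = {1: "open source search engine", 2: "open source web crawler"}
index = build_index(docs)
print(sorted(index["open"]))    # documents containing "open"
print(sorted(index["search"]))  # documents containing "search"
```

Looking up a word is then a single dictionary access instead of a scan over every document.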


Knowledge Extraction – the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources

Knowledge Graph – a knowledge base used by Google to enhance its search results with structured information about an entity (people, places, brands, etc.) gathered from a wide variety of sources


Machine Learning – a part of artificial intelligence in which machines learn from what they are doing and become better over time

MapReduce – a software framework for processing and generating large data sets with a parallel, distributed algorithm on a Hadoop cluster
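
The map, shuffle and reduce phases can be sketched in plain Python with the classic word-count example. This runs in a single process and only imitates the programming model; it does not use Hadoop, and the function names below are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big cluster"])
print(counts)  # {'big': 2, 'data': 1, 'cluster': 1}
```

On a real cluster the map and reduce calls would run in parallel on different machines, with the framework handling the shuffle between them.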


N-gram – a contiguous sequence of n co-occurring items (syllables, letters, words) within a given document, used in text mining and NLP
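
A short Python sketch of extracting n-grams, treating an n-gram as a run of n adjacent items (the usual convention in text mining). The function name ngrams and the sample sentence are assumptions for illustration.

```python
def ngrams(items, n):
    """Return all runs of n adjacent items from `items` as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


words = "the quick brown fox".split()
bigrams = ngrams(words, 2)
print(bigrams)  # [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```

The same function works on a string to produce letter n-grams, for example ngrams("data", 3).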

NLP – [Natural Language Processing] a field of computer science involved with interactions between computers and human languages

Nutch – an open source web crawler software project coded in Java and released under the Apache license


SaaS – [Software as a Service] a software delivery method that provides remote access to software and its functions as a Web-based service and that is licensed on a subscription basis

Search Engine – a program that searches for and identifies items in a database or a website that correspond to keywords specified by the user

Semantic Search – seeks to improve search accuracy by not only finding keywords but by understanding the searcher’s intent and the contextual meaning of terms

Social Graph – a graph or a map that illustrates interconnections among people, groups and organizations in a social network


Web crawler – a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index
