Data Mining

Read the words and expressions and learn them if they are new to you:

Data mining – intelligent analysis of data (also rendered as "data extraction" or "data excavation")

A competitive edge – a competitive advantage

Regularities – recurring patterns, consistent relationships

Intelligent guesses – reasonable assumptions

Deductive reasoning – deductive thinking

Excitatory connection – a connection that stimulates activity (inhibitory – one that suppresses it)

Derive conclusions – to draw conclusions

Clustering – clusterization, dividing into clusters

Insurance company – a company that provides insurance

Fraud – swindling, deception for gain

Claims – demands (here: requests for an insurance payment)

Decision trees – tree-shaped models for making decisions

Interesting patterns – interesting regularities ("regularity" is another way to render "pattern")

Data warehouses – large central stores of data

Data marts – data showcases (smaller, subject-specific data stores)

Scrapping the data – abandoning (discarding) the data

Supreme Court decisions – rulings of the Supreme Court

Resolving bottlenecks in production processes – eliminating the narrow points that slow down production processes

Read the text:

Data mining is simply filtering through large amounts of raw data for useful information that gives businesses a competitive edge. This information is made up of regularities and trends that are already in the data but were previously unseen.

The most popular tool used when mining is artificial intelligence (AI). AI technologies try to work the way the human brain works, by making intelligent guesses, learning by example, and using deductive reasoning. Some of the more popular AI methods used in data mining include neural networks, clustering, and decision trees.

An artificial neural network is composed of artificial neurons, or nodes. The connections of the biological neuron are modeled as weights. A positive weight reflects an excitatory connection, while a negative weight reflects an inhibitory one. Each input is multiplied by its weight, and the results are summed. This operation is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be between −1 and 1. Neural networks may be used for predictive modeling, adaptive control, and applications where they can be trained via a dataset. Self-learning resulting from experience can occur within networks, which can derive conclusions from a complex and seemingly unrelated set of information.
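The weighted sum and the activation function described above can be sketched in a few lines of Python. This is only an illustration of a single artificial neuron, not a full network: the function name `neuron_output` and the sample weights are assumptions made for this example, and the sigmoid and tanh functions are just two common choices that give the output ranges 0 to 1 and −1 to 1.

```python
import math

def neuron_output(inputs, weights, activation="sigmoid"):
    """Weighted sum of the inputs passed through an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Linear combination: each input is multiplied by its weight, then summed.
    total = sum(x * w for x, w in zip(inputs, weights))
    if activation == "sigmoid":
        # Sigmoid squashes the sum into the range (0, 1).
        return 1.0 / (1.0 + math.exp(-total))
    if activation == "tanh":
        # tanh squashes the sum into the range (-1, 1).
        return math.tanh(total)
    raise ValueError("unknown activation: " + activation)

# Positive weights act as excitatory connections, negative ones as inhibitory.
excited = neuron_output([1.0, 1.0], [2.0, 1.5])
inhibited = neuron_output([1.0, 1.0], [-2.0, -1.5])
```

With the same inputs, the positive weights push the output above 0.5 and the negative weights push it below 0.5.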

Clustering divides data into groups whose records have similar values or fall within limited data ranges. Clusters are used when data isn't labelled in a way that is favorable to mining. For instance, an insurance company that wants to find instances of fraud wouldn't have its records labelled as fraudulent or not fraudulent. But after analyzing patterns within clusters, the mining software can start to figure out the rules that point to which claims are likely to be false.
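To make the idea of grouping unlabelled records concrete, here is a minimal sketch of one well-known clustering method, k-means, applied to a single numeric column. The text does not say which algorithm a mining tool uses, so k-means is an assumption, and the claim amounts and starting centers are made up for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign the value to the closest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Made-up insurance claim amounts with no "fraud" labels attached.
claim_amounts = [120, 150, 130, 9800, 10200, 140, 9900]
centers, clusters = kmeans_1d(claim_amounts, centers=[0, 10000])
```

The small and the large claims end up in separate groups. A cluster alone does not prove fraud; it only gives an analyst a group of similar records to examine for rules.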

Decision trees, like clusters, separate the data into subsets and then analyze the subsets to divide them into further subsets, and so on (for a few more levels). The final subsets are then small enough that the mining process can find interesting patterns and relationships within the data.
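The repeated splitting into smaller subsets can be sketched as a short recursive function. This is a simplified illustration, assuming numeric fields and a split at each field's mean value; real decision-tree tools usually choose their splits with statistical measures. The function name `split_into_subsets` and the sample records are hypothetical.

```python
def split_into_subsets(records, features, min_size=2):
    """Recursively split records on each feature's mean value, tree-style."""
    if not features or len(records) <= min_size:
        return records  # leaf: a subset small enough to inspect
    feature, rest = features[0], features[1:]
    threshold = sum(r[feature] for r in records) / len(records)
    low = [r for r in records if r[feature] <= threshold]
    high = [r for r in records if r[feature] > threshold]
    if not low or not high:
        # This feature does not separate the records; try the next one.
        return split_into_subsets(records, rest, min_size)
    return {
        "feature": feature,
        "threshold": threshold,
        "low": split_into_subsets(low, rest, min_size),
        "high": split_into_subsets(high, rest, min_size),
    }

# Made-up records: the first split is on age, the second on amount.
records = [
    {"age": 25, "amount": 100},
    {"age": 30, "amount": 900},
    {"age": 60, "amount": 120},
    {"age": 65, "amount": 950},
]
tree = split_into_subsets(records, ["age", "amount"], min_size=1)
```

Each level of the resulting dictionary is one split, and the lists at the bottom are the final subsets.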

Once the data to be mined is identified, it should be cleansed. Cleansing data frees it from duplicate information and erroneous data. Next, the data should be stored in a uniform format within relevant categories or fields. Mining tools can work with all types of data storage, from large data warehouses to smaller desktop databases to flat files. Data warehouses and data marts are storage methods that involve archiving large amounts of data in a way that makes it easy to access when necessary.
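The cleansing step can be illustrated with a small sketch that removes duplicates and erroneous records and stores the rest in a uniform format. The field names `name` and `amount`, and the rule that a negative or non-numeric amount counts as erroneous, are assumptions chosen for this example.

```python
def cleanse(records):
    """Drop duplicate and erroneous records and store the rest in a uniform format."""
    seen = set()
    cleaned = []
    for rec in records:
        # Uniform format: trimmed, title-cased name and a numeric amount.
        name = str(rec.get("name", "")).strip().title()
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            continue  # erroneous: the amount is missing or not a number
        if not name or amount < 0:
            continue  # erroneous: empty name or negative amount
        key = (name, amount)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({"name": name, "amount": amount})
    return cleaned

# Made-up raw records: one duplicate and two erroneous entries.
raw = [
    {"name": " alice smith ", "amount": "120.50"},
    {"name": "Alice Smith", "amount": 120.5},
    {"name": "Bob Jones", "amount": "n/a"},
    {"name": "Carol White", "amount": -5},
]
cleaned = cleanse(raw)
```

Only one record for Alice Smith remains, because the second entry is a duplicate once both are put into the same format.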

When the process is complete, the mining software generates a report. An analyst goes over the report to see if further work needs to be done, such as refining parameters, using other data analysis tools to examine the data, or even scrapping the data if it's unusable. If no further work is required, the report proceeds to the decision makers for appropriate action.

The power of data mining is being used for many purposes, such as analyzing Supreme Court decisions, discovering patterns in health care, pulling stories about competitors from newswires, resolving bottlenecks in production processes, and analyzing sequences in the human genetic makeup. There really is no limit to the type of business or area of study where data mining can be beneficial.

Find the answers to these questions in the text.

1 What tool is often used in Data Mining?

2 What AI methods are used in data mining?

3 What can you tell us about neural networks?

4 When are clusters used in Data Mining?

5 What do decision trees do?

6 What is the purpose of using data warehouses?

7 What can an analyst do to improve the mining results?

8 Name some of the ways in which data mining is currently used.




©2015-2024 poisk-ru.ru
All rights belong to their authors. This site does not claim authorship and provides the material for free use.
Page created: 2021-12-08

