Normative and reference information (NSI) management in information systems. Technology for building master data from the Intertech company

When the scale and complexity of an information system are discussed, characteristics such as the number of workstations, the flow of processed documents and the total volume of databases are usually cited. Recently, however, the number and size of directories have increasingly been mentioned as an integral characteristic. This means little to outsiders, but to specialists such information speaks volumes. After all, it is reference data (the range of goods and products, details of partners, suppliers and clients, the description of the organization's structure, etc.) that forms the information core of the enterprise management system, including accounting, resource planning, CAD, etc.; it ensures consistency and consolidation of data, eliminates redundancy and speeds up the search for the necessary information. In addition, directories tie together all other documents of the system - invoices, contracts, orders, etc. - throughout their entire life cycle.

Currently, one of the most important issues in the development of corporate-level information technology is data integration. Quite often it is understood as the ability to work with various data formats from different physical sources (including converting them from one format to another). But such a view is superficial at best. In fact, information cannot be reconciled and correctly understood unless it is interpreted through common directories.

Fig. 1. Organization of work of the centralized reference data service

This problem is relevant all over the world, but its significance is especially great for Russia, and two points can be highlighted here:

Automation of domestic companies, as a rule, developed from the bottom up, through the gradual computerization of individual areas and divisions. Besides running on different software and hardware platforms, these areas also used local directories, and combining them is not an easy task;

At the current stage of the formation of a market economy in our country, mergers, acquisitions, the formation of holding structures, etc. are actively under way. Here, too, complex problems of unifying information resources arise, but often already at the level of complete enterprise management systems.

We single out these two points to emphasize the fundamental difference between the situations behind them. In the first case we can, on the whole, simply speak of mistakes made when the system was created: a unified enterprise reference system should have been built from the start, following a top-down design methodology. When different enterprises merge, the situation is much more complicated, because the companies were independent.

Fig. 2. Using Ontologic 5.0 technology, you can create a unified master data management system

But such internal problems of organizations are just the tip of the iceberg! In the era of economic globalization and e-business, enterprise information systems must communicate with the information systems of partners, suppliers, and clients. And they must speak in a language that each other understands. Then we could move on to issues of public administration...

To illustrate the significance of reference information, let us give just two examples.

1. As you know, about a year ago the Anglo-Russian company TNK-BP was created as the result of a series of preliminary mergers of large industrial companies (ONAKO, SIDANKO, TNK). One of the first tasks set by the management of the new company was to organize a unified corporate directory-classifier of material and technical resources. This had to be done even before other areas for the development of integration solutions and management systems were identified. Moreover, it was precisely this shared normative base that was supposed to help form a unified Anglo-Russian corporate mentality (the company employs specialists from the Russian TNK and the English BP) and a common understanding of doing business.

2. At the end of the Second World War, US President Roosevelt set the task of understanding the causes of problems with the supply of spare parts to the front. Having carried out the necessary research, the Americans came to the conclusion that spare parts were sent to the troops in quantities several times greater than the need for them. At the same time, there was still a shortage of spare parts due to the fact that the same products accumulated in warehouses, but labeled differently and bearing different names. As a result, the president issued a directive to create a unified federal system for cataloging supplies for government needs, and primarily for defense and security needs. Over the past twenty years, the United States has annually invested from 2 to 4 billion dollars in standardization programs alone using modular structures (product analogues) to reduce the range of Department of Defense supplies by approximately three times.

Management of normative and reference information

In the West, such reference information in automated enterprise management systems is called Master Data, and the tasks of managing it are called Master Data Management (MDM). In Russian, however, the concept of normative and reference information (NSI) is now more often used; it appeared in disciplines related to the management of the national economy back in pre-computer times. Here the word "normative" reflects the fact that the problem of creating corporate-level directories goes far beyond the boundaries of the enterprise itself; it must be solved taking into account industry, state and international standards.

We can give the following definition: master data is a conditionally permanent part of all corporate (institutional) information, in contrast to current information generated directly in the process of the organization’s activities. Master data includes dictionaries, reference books and classifiers, data from which (for example, terms, units of measurement, codes, names of materials, contractors, etc.) are used in the generation of current documents. Thus, when generating an invoice on a computer, the names of materials, units of measurement, the name of the recipient enterprise (counterparty), its details and a number of other fields, as a rule, are selected from directories built into the system, rather than entered manually.
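The invoice example above can be sketched in a few lines. This is only an illustration under stated assumptions: the directories `MATERIALS` and `COUNTERPARTIES`, their codes and all field names are hypothetical and do not come from any system mentioned in the article.

```python
from dataclasses import dataclass

# Hypothetical master data: the "conditionally permanent" directories.
MATERIALS = {
    "M-001": {"name": "Steel pipe 57x3.5", "unit": "m"},
    "M-002": {"name": "Gate valve DN50", "unit": "pcs"},
}
COUNTERPARTIES = {
    "C-100": {"name": "Example Supply LLC", "tax_id": "0000000000"},
}


@dataclass(frozen=True)
class InvoiceLine:
    material_code: str
    material_name: str
    unit: str
    quantity: float


def make_invoice_line(material_code: str, quantity: float) -> InvoiceLine:
    """Fill name and unit from the directory instead of typing them by hand."""
    material = MATERIALS[material_code]  # raises KeyError for an unknown code
    return InvoiceLine(material_code, material["name"], material["unit"], quantity)


def make_invoice(counterparty_code: str, items):
    """Build a current document (an invoice) from master data plus quantities."""
    counterparty = COUNTERPARTIES[counterparty_code]
    return {
        "recipient": counterparty["name"],
        "tax_id": counterparty["tax_id"],
        "lines": [make_invoice_line(code, qty) for code, qty in items],
    }


invoice = make_invoice("C-100", [("M-001", 12.0), ("M-002", 3)])
```

Only the codes and quantities are entered by the user; everything else is copied from the directories, so a misspelled name or a wrong unit cannot enter the document.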

To assess the scale of MDM tasks, the following data can be provided. For large companies in the oil and gas sector, the size of material directories ranges from 100 to 250 thousand items, and for counterparties - from 3 to 12 thousand entries.

Clearly, creating reference data and keeping it up to date are independent tasks within the enterprise management system as a whole; they are often handled by a separate service of the company.

According to experts, in our country the cost of processing one master data record is 2-5 dollars (abroad - 10-20 dollars). Accordingly, the cost of one project to build the master data of a large enterprise can be estimated at 400-1000 thousand dollars (including the cost of software, implementation consulting and support).
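One possible reading of these figures is shown below. The assumption (not stated in the article) is that the project estimate is simply the per-record cost multiplied by the number of records; under that assumption the quoted range corresponds to a directory of about 200 thousand records, which fits the 100-250 thousand items quoted above for material directories. The function name is hypothetical.

```python
def project_cost_range(records, cost_low=2.0, cost_high=5.0):
    """Return the (low, high) project cost in dollars for a given record count.

    Assumes cost scales linearly with the number of master data records.
    """
    return records * cost_low, records * cost_high


low, high = project_cost_range(200_000)
print(f"{low:,.0f} - {high:,.0f} dollars")
```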

The oil and gas industry, as well as a number of state and regional structures, were the first to understand the need to carry out work on master data as an independent part of creating an organization’s management system. Currently, approximately 10-15 large projects on this topic are being implemented in Russia, while analysts note a rapid increase in interest in this work from both the corporate and public sectors. To meet the growing needs of clients, a proven methodology for implementing such projects is needed.

The problem of creating a corporate NSI system is precisely that it has no simple solution. It would seem that the most reasonable way is to use a ready-made set of directories (international, state, industry). But such directories are extremely inconvenient for a specific enterprise (they are too redundant and do not take the specifics of the organization into account), and besides, it is simply impossible to create such a global master data system in full (for more on this topic, see Dmitry Gulko's article "How to avoid typical mistakes when building corporate and industry systems of regulatory and reference information", PC Week/RE, N 18/2004, p. 35).

Solving the problem is possible only in the form of creating a specialized system for maintaining master data using appropriate standards, methods and software. In fact, this work should combine the efforts of three parties:

Creators of regulations and standards (both state and industry);

Basic software suppliers;

System integrators and consultants who can implement all this taking into account industry practice, national specifics, etc.

In Soviet times, government agencies were very actively involved in normative regulation. With the beginning of perestroika this activity broke down, and only 5-7 years ago did government structures take up this work again. Several laws and regulations on this topic have already been adopted, and there are currently several state standard classification systems for products and activities (OKP, OKVED, OKDP, TN VED, ECPS). However, each of them has its own specialized purpose and is not suitable for use in its pure form in industry or corporate systems. Western classification systems cannot be applied in our country because of the significant national specifics of our economy. In general, it should be noted that bringing order to corporate master data calls for more active participation by government agencies, but without crossing the line of reasonable regulation.

Fig. 3. Functional diagram of the reference data management system

Master Data Management issues are also in the field of attention of basic software suppliers. At the same time, they approach their solution from different directions. First of all, naturally, these tasks are dealt with by manufacturers of ERP solutions, and the leader here is SAP. Another example is infrastructure integration software developers. Here we should mention IBM Corporation - its recent acquisition of Ascential Software is largely explained by the corporation's intention to strengthen the MDM direction (see PC Week/RE, N 10/2005, p. 12). Finally, something needs to be said about document management system providers (eg Hummingbird). Their presence in the MDM segment is explained, on the one hand, by their experience in solving data integration problems, and on the other, by the need to use intelligent technologies for processing unstructured information to manage reference data.

As for system integrators and consulting companies, MDM issues are dealt with to one degree or another by all companies that carry out large projects to create enterprise management systems. Some of them (Intertech, LANIT, IBS, Unit Space, Katalit) have specialized developments in this area. Below we briefly describe the proposals for building corporate reference data systems from the Intertech company, which in recent years has acquired solid experience in implementing such solutions in companies such as TNK-BP, Tatneft and SIBUR, as well as in various federal departments, departments of the Moscow government, etc. It recently entered into a cooperation agreement in the field of MDM with SAP Corporation (see PC Week/RE, N 13/2005, p. 49).

Technology for constructing master data from the company "Intertech"

The methodology proposed by Intertech involves creating a unified system for maintaining reference data that links all regulatory and reference information of the company's divisions, subsidiaries and partners into a common corporate information space (Fig. 1).

Its implementation requires, first of all, the development and adoption of a set of standards and regulations for maintaining an enterprise’s master data. As a technological basis for constructing reference data systems, an ontological model of classification and coding is used - a formal description of accounting objects, based on the identification of their essential properties (Fig. 2). This approach ensures the accumulation of any amount of consistent information and combines the advantages of hierarchical, facet, adaptive and reference classification systems. In general, this technique makes it possible to standardize the actions of expert specialists when they carry out operations to classify and encode groups (classes) of accounting objects, determine the properties (features) of classes and their values, and build navigation hierarchies. It also includes a description of typical user requests, divided into groups according to the degree of uncertainty and imprecision of wording, and recommendations for support service specialists (experts).
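The idea of describing accounting objects through the essential properties of their classes can be sketched as follows. This is an illustration only, not the Ontologic data model: the `ObjectClass` type, the class codes and the property names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ObjectClass:
    """A class of accounting objects defined by its essential properties."""
    code: str
    name: str
    parent: Optional["ObjectClass"] = None
    # property name -> set of allowed values (hypothetical, simplified)
    properties: Dict[str, Set[str]] = field(default_factory=dict)

    def all_properties(self) -> Dict[str, Set[str]]:
        """Own properties plus those inherited along the hierarchy."""
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}

    def validate(self, values: Dict[str, str]) -> List[str]:
        """Check a candidate object description against the class definition."""
        errors = []
        for prop, allowed in self.all_properties().items():
            if prop not in values:
                errors.append(f"missing property: {prop}")
            elif values[prop] not in allowed:
                errors.append(f"invalid value for {prop}: {values[prop]}")
        return errors


pipes = ObjectClass("10", "Pipes", properties={"material": {"steel", "copper"}})
seamless = ObjectClass("10.1", "Seamless pipes", parent=pipes,
                       properties={"diameter_mm": {"57", "89"}})

print(seamless.validate({"material": "steel", "diameter_mm": "57"}))
print(seamless.validate({"material": "wood"}))
```

Because every object is checked against the properties of its class, two experts who classify the same item independently are pushed toward the same description, which is the standardizing effect the methodology aims for.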

Fig. 4. Main stages of work to create a unified system for maintaining reference data

The actual system for maintaining reference data is implemented in the form of a software and hardware complex (Fig. 3), which includes tools for maintaining directories and classifiers, tools for searching for accounting objects, modules for exchanging information between experts and users, and mechanisms for integration with external applications. Its main functional software subsystems integrated with each other are “user workstation”, “expert workstation” and “administrator workstation”. The system in its standard configuration is based on Microsoft technologies (OS - Windows, Web server - IIS, DBMS - SQL Server), but it also provides the ability to use other software platforms.

The Intertech company has also developed a step-by-step methodology for implementing an enterprise master data system (Fig. 4). The underlying approach is based on a number of basic principles.

The evolutionary development of the system involves a step-by-step transition to modern methods of maintaining and supporting corporate reference data. The general scheme of this approach looks like this: old -> old + new -> new; at intermediate stages, the parallel existence of the old and new systems is allowed.

Adaptability of the reference data system to the specifics and landscapes of existing application systems (including ERP-class systems) and to various classification and coding systems presupposes its ability to integrate with external systems.

Continuity allows us to preserve all the best and valuable things that have been developed over years and decades. This concerns the use of the potential of reference data specialists, the stable functioning of existing application systems, the possibilities of migration and transformation of accumulated information arrays.

Standardization and unification of regulations and methods for using and maintaining corporate master data, classification and coding systems make it possible to ensure the constant relevance and availability of master data throughout the company.

Taking the human factor into account means that different categories of users, with different skills and degrees of "advancement" in information technology, can work in the system, and that its interfaces are ergonomic and "friendly".
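The evolutionary scheme "old -> old + new -> new" from the first principle can be sketched as a lookup that serves both directories during the intermediate stage. This is a minimal illustration; the class name, the crosswalk table and the record contents are hypothetical.

```python
class TransitionalDirectory:
    """Serves records during the intermediate "old + new" stage."""

    def __init__(self, old_records, new_records, old_to_new):
        self.old_records = old_records  # legacy directory: old code -> record
        self.new_records = new_records  # unified directory: new code -> record
        self.old_to_new = old_to_new    # crosswalk, filled in step by step

    def lookup(self, old_code):
        """Prefer the new directory once the code has been migrated."""
        new_code = self.old_to_new.get(old_code)
        if new_code in self.new_records:
            return "new", self.new_records[new_code]
        return "old", self.old_records[old_code]

    def migrated_share(self):
        """Fraction of legacy codes already mapped to the new directory."""
        if not self.old_records:
            return 1.0
        mapped = sum(1 for code in self.old_records
                     if self.old_to_new.get(code) in self.new_records)
        return mapped / len(self.old_records)


directory = TransitionalDirectory(
    old_records={"17": "Pipe (legacy)", "18": "Valve (legacy)"},
    new_records={"N-1": "Steel pipe"},
    old_to_new={"17": "N-1"},
)
```

When `migrated_share()` reaches 1.0, the old directory is no longer consulted and can be retired, which corresponds to the final "new" stage.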

Fig. 5. Functional model of the process of using and maintaining a unified reference data base

For the effective functioning of a unified system for maintaining reference data, a set of organizational and management solutions must be developed, providing for a clear division of responsibilities and functional responsibilities in accordance with the competencies of the company’s personnel groups (Fig. 5):

Users - company employees who use certain data from the master data base when generating working documents;

Experts - specialists of the reference data group, responsible for generating and changing data in the reference data database;

Profile specialists who are well versed in certain aspects of one or another regulatory and reference information that is within their competence in the main professional activity. They participate in the procedure for agreeing on added or changed data upon the recommendation of a specialist expert from the reference data group;

Technical support specialists are automation and IT service personnel who provide maintenance of system software and hardware.
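The division of responsibilities listed above can be expressed as a small approval workflow. The sketch below is an assumption about how such a procedure might be encoded, not a description of the Intertech system: the statuses, role names and transition table are hypothetical, and the technical support role is left out because it does not change data.

```python
from enum import Enum


class Status(Enum):
    REQUESTED = "requested"   # a user asked for a new or changed record
    DRAFTED = "drafted"       # an expert prepared the change
    IN_REVIEW = "in_review"   # sent to a profile specialist for agreement
    PUBLISHED = "published"   # visible to all users


# Which role may move a change request from one status to the next.
TRANSITIONS = {
    (Status.REQUESTED, Status.DRAFTED): "expert",
    (Status.DRAFTED, Status.IN_REVIEW): "expert",
    (Status.IN_REVIEW, Status.PUBLISHED): "profile_specialist",
    (Status.DRAFTED, Status.PUBLISHED): "expert",  # no agreement needed
}


class ChangeRequest:
    def __init__(self, description):
        self.description = description
        self.status = Status.REQUESTED
        self.history = []

    def advance(self, role, new_status):
        """Move to new_status if the given role is allowed to do so."""
        allowed_role = TRANSITIONS.get((self.status, new_status))
        if allowed_role != role:
            raise PermissionError(
                f"{role} cannot move {self.status.value} -> {new_status.value}"
            )
        self.history.append((self.status, new_status, role))
        self.status = new_status


request = ChangeRequest("add gasket DN50 to the materials directory")
request.advance("expert", Status.DRAFTED)
request.advance("expert", Status.IN_REVIEW)
request.advance("profile_specialist", Status.PUBLISHED)
```

Keeping the transitions in one table makes the regulations explicit: a user can never publish a record, and a profile specialist only acts on requests that an expert has referred.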

In general, the implementation of a unified system for maintaining master data allows the customer to solve the following main tasks that help improve the efficiency of the entire enterprise:

Create a centralized master data repository that operates within the company’s unified information space and includes the entire range of material and technical resources and other accounting objects;

Centralize the functions of maintaining reference data based on developed corporate classification and coding standards;

Create unified regulations and technological environment for user access to reference data, maintenance of classifiers and reference books by experts and technical support of the system by administrators;

Use software built into the system that maintains the required level of data security and its constant updating, excluding the storage of duplicate, erroneous or outdated information;

Introduce classifiers and directories of reference data into existing management, accounting and other systems, which streamlines the maintenance of regulatory and reference information and reduces its cost;

Promptly provide company management with the information necessary to make effective decisions.
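The task of excluding duplicate records, mentioned in the list above, can be illustrated with a deliberately simple matching rule. This is a sketch under stated assumptions: names are compared by a normalized key (lower case, punctuation removed, words sorted), while real systems use much richer matching; the function names and sample records are hypothetical.

```python
import re
from collections import defaultdict


def normalize(name):
    """Build a comparison key: lower-case alphanumeric tokens, sorted."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(tokens))


def find_duplicates(records):
    """Group record codes whose names share the same normalized key."""
    groups = defaultdict(list)
    for code, name in records.items():
        groups[normalize(name)].append(code)
    return [sorted(codes) for codes in groups.values() if len(codes) > 1]


records = {
    "A1": "Gate valve DN50",
    "B7": "valve, gate DN50",
    "C3": "Steel pipe 57",
}
print(find_duplicates(records))
```

Sorting the tokens ignores word order, so the rule can also merge items that are genuinely different; in practice a candidate group would be passed to an expert for confirmation rather than merged automatically.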

Industry: Energy and housing and communal services

Since Rosatom unites many enterprises and organizations, the creation of industry-wide directories is a necessary condition for centralizing procurement activities and relations with suppliers and making them transparent, as well as for organizing the joint operation of the IT systems of industry enterprises. That is why the project to create a unified industry system of normative and reference information (EOS NSI) was included in the State Corporation's "Program for the transformation of the financial and economic block and information technologies". The EOS NSI system will cover organizations in the "engineering and construction of nuclear power plants", "operation of nuclear power plants" and "nuclear fuel life cycle" blocks. Based on the results of an open competition, IBS was engaged to implement the project, and a specialized SAP solution was chosen as the platform.

To date, a pilot project to create the EOS NSI has been completed. Within its framework, a number of directories have been created: "Counterparties" (debtors/creditors, legal entities, residents/non-residents), "Material and technical resources" (MTR), "Elements of industrial facilities", "Unified Chart of Accounts", and a set of all-Russian directories and classifiers. The most widely used directories are "Counterparties" (currently about 70 thousand entries), "MTR" (about 150 thousand entries) and "Unified Chart of Accounts". Already during the pilot project, 215 enterprises were connected to the EOS NSI for the "Counterparties" directory and 35 enterprises for the "MTR" directory. In total, more than 4 thousand users work in the system; they access it through a portal service.

Work is currently underway to replicate the system. It is expected that by 2012 the total number of EOS NSI users will reach about 10 thousand people. Thus, this project in the field of organizing regulatory and reference information on the SAP platform will become one of the largest in the world and the largest in Europe. The results of the project and the capabilities of the EOS NSI will be used in a number of other IT projects of the Rosatom State Corporation. Among them are the creation of a unified industry procurement system, a unified industry system for integrating corporate applications, a unified industry document flow system, a property asset management system, the introduction of a unified settlement center system, etc.

“Centralization of the processes of maintaining regulatory and reference information and the use of an appropriate information system will, firstly, increase the quality and reliability of the information provided by IT systems. Secondly, it will reduce the cost and time of generating consolidated reporting. Thirdly, it will minimize risks due to incomplete or incorrect payment data. It will also make it possible to optimize procurement processes and work with suppliers and make them transparent, reduce planning time for the procurement and supply of materials and equipment, and optimize the process of maintaining master data by organizing a unified management environment,” noted Kirill Sukovykh, Head of the NSI program at Greenatom CJSC.

In the context of the transition to a digital economy, companies have finally become convinced that data is an asset that must be properly stored, processed, analyzed and used for making decisions and forecasts. The efficiency of these processes is ensured by a single repository into which verified, high-quality data must be loaded. Consolidating that data from different sources involves collating and synchronizing directories in various IT systems. This is why businesses need normative and reference information (NSI) management systems.

According to TAdviser, the market for master data management systems amounted to about 1.5 billion rubles at the end of 2017. Demand for these solutions is growing by 20-25% per year, in direct proportion to the growth of business digitalization. The acceleration is helped by the growing penetration of cloud services in the domestic market (about 20% per year), as well as by the launch of initiatives for the informatization of the state and society as part of the implementation of the program.

The general course toward digitalization dictates the need for a unified knowledge base about customers, products, etc. For digital initiatives to succeed, data must be managed effectively and must first be brought together to form a reliable and accurate “single version of the truth” for all structural divisions. Accordingly, there is a growing demand for data collation and consolidation tools that enable rapid access to information regardless of its source, analysis of patterns and anomalies, and secure distribution of data.

In Focus

Business representatives are becoming more and more demanding of the quality of reference data and its management processes. Questions about the quality of reference data also arise locally, among specialists. The more sharply the volume of data accumulated by organizations increases, the higher the requirements for the performance of information systems. The volumes of reference books are constantly growing. Modern solutions in the field of reference data management are expected to be able to support work with more than 1 billion records.

If 10 years ago, at the end of the 2000s, the tasks of reference data were more often understood as the process of migrating directories as part of the implementation of accounting information systems, then in 2018, businesses approach the tasks of managing reference data more consciously and structurally, with the involvement of functional departments that directly use this information in business processes. Tasks related not only to equipment and materials, but also to contractors and other reference data are clarified.

“The current situation requires a higher level of automation and formalization: everything that can be “hardwired” into a clear automated algorithm must be formalized, because without strict rules, work with reference data turns into chaos. Also, the involvement of various business departments in NSI projects increases their duration. As a solution, modern means of automating data quality using mechanisms come to the fore,” comments Bair Danilov, head of the reference data department at IBS.

As of 2018, up to 75% of this market comes from consulting and about 25% is occupied by licenses. This situation is due to the fact that in addition to the direct creation of the directory, companies need its integration with other information systems, and for customer directories - with personal data protection systems, Krok notes.

New trends

Among the “hot” global technological trends changing the reference data market, IBS experts note the expansion of the scope of directory management, i.e. management not only of basic data - contractors and materials, but also of a unified chart of accounts, production assets and other necessary reference books for key business processes of the enterprise. Also in focus is the automation of the process of checking reference data, including using machine learning technologies, the development of uniform standards for maintaining counterparties and materials, as well as the creation of digital ecosystems in which manufacturers and buyers can freely exchange transparent information about goods and transactions.

Improvement remains the defining trend. Machine learning technologies allow for higher-quality automated deduplication. In general, the development noticeably changes previously established approaches to working with reference data: the efficiency of data recognition and correction increases, and it becomes possible to use multimedia information, make data more visual, etc.

Today, not only economic giants but also medium-sized companies are showing interest in the quality of reference data. IBS notes an increase in the number of requests for NSI projects from the pharmaceutical industry, the food industry, mechanical engineering and agriculture. This interest is also spurred by import substitution initiatives: it is the introduction of NSI that makes it possible to carry out a phased and rational substitution of foreign solutions.

Top 8 players in the Russian market of reference data management systems

Company: Krok | IBS | SDI Solution | NCIT "Intertech" | TaskData | Lanit | EAE-Consult | Navicon
Revenue from NSI projects, 2016: RUB 135 million | RUB 99.9 million | RUB 63.3 million | RUB 51.9 million | RUB 44 million | RUB 28 million | RUB 11.2 million | RUB 7.5 million
Revenue dynamics for NSI projects, 2016/2015: 13% | 6% | 32% | 50% | 90% | 10% | -10% | growth
Number of NSI projects 2017 4 7 7 6 completed, 2 in progress 6 5 4
Number of NSI projects 2016 4 in progress, 1 completed 4 7 4 5 3 4
Solutions/platforms used: Croc NSI Suite, Talend Platform for MDM, MDM, Informatica MDM, a number of Oracle systems, as well as the domestic Unidata platform | SAP, Ataccama, own development (20%), 1C MDM | proprietary development of Semantic MDM. DBMS: Microsoft SQL Server, Oracle, PostgreSQL | own development - the Ontologic software platform for managing reference data systems (registered in the Register of Russian Software, No. 4114 dated December 11, 2017);

2. Mining and metallurgical company - creation of an Automated system for managing regulatory and reference information, development of a materials and equipment classifier, normalization of the materials and equipment directory and the directory of contractors. More than 2000 users. Based on SAP MDM, SAP PI, SAP Portal, SAP BPM.

3. Federal executive authority - Consolidation and cleaning of received information, integration of the solution into the corporate IS. Based on Informatica MDM, Informatica Power Center, Informatica Data Quality, Oracle BPM.

1. United Engine Construction Corporation - “Creation and implementation of a Corporate system for managing normative and reference information on the platform of the Semantic Reference Information Management System”

2. Development of an automated system for managing regulatory and reference information of JSC Kalashnikov Concern on the platform of the Semantic reference data management system.

3. “Development of an automated system “Management of electronic directories of an enterprise” for the needs of PJSC RSC Energia.

1. Design, implementation and commissioning into commercial operation of the corporate NSI management system of the Inter RAO Group;

2. Creation of a Unified System of Regulatory and Reference Information of the State Oil Company of the Azerbaijan Republic (SOCAR);

3. Creation of a unified system for managing regulatory and reference information in the Company CJSC “ABI Product”;

4. Implementation of the master data management system (extended reference books) of PJSC MMC Norilsk Nickel;

5. Normalization of the Unified Directory of Materials and Equipment and mapping in the records of the Unified Nomenclature Directory as part of the project to introduce a unified concept for managing corporate master data of PJSC Polyus;

6. Creation of a methodological and regulatory framework for regulatory and reference information regarding the directory of material and technical resources and normalization of the directory of materials and equipment of Irkutsk Oil Company LLC;

7. Creation of a Unified system for managing regulatory and reference information of basic data of the Power Machines group of companies.

1. Industry center for the development and implementation of information systems (OCRS). The functionality of the first stage of ASOUP-3, which includes an Automated complex for maintaining reference data, has been developed.

2. Federal Forestry Agency (Rosleskhoz). Creation of a subsystem for managing normative and reference information (PNSI).

3. United Instrument-Building Corporation (UPK). A project to build a model of a reference data management system as part of the implementation of the “Network Integrated Settlement and Information Management System” (SIRIUS) project - a centralized procurement management system for the defense industry.

1. Development of an automated system for maintaining reference data in one of the largest banks in Russia (on the Microsoft platform using the NORMA reference data management system, Oracle database).

2. Development of a master data management system for Gazprombank (on the Microsoft platform using the NORMA master data management system, Microsoft SQL Server DBMS).

KSSS, phase 8 - migration of directories to the IBM MDM platform, interfaces to the SAP PI bus, quality control of reference data;

Integration of KSSS with 1C DO - integration of the Counterparties directory with 1C systems in DO;

KSSS-NSI RREM - migration and creation of RREM directories on the IBM MDM platform, interfaces to the SAP PI bus. Oracle DBMS is used for the lower storage layer and for the master data mart.

1. Food Union (consolidation of reporting from several branches and production facilities, the ability to make management decisions based on a constantly updated data set, implementation in the Microsoft Azure cloud environment).

2. Gazprom Gazenergoset (automation of loading aggregated data from the accounting systems of subsidiaries and dependent companies (SDC) into the corporate data warehouse (CDW) in the central office).

3. Specialized depository “Infinitum Specialized Depository” (optimization of business processes for maintaining regulatory and reference information, optimization of the architecture by creating a centralized master data repository, elimination of duplicates and double data entry)

The largest projects by number of directories in 2015-2017

1. Project 1. The volume of reference books is more than 30.

2. Project 2. Volume of reference books: about 20.

3. Project 3. The volume of reference books is more than 200.

1. Project of JSC "UEC". The volume of reference books is more than 20.

Integration of KSSS with 1C for SIP - 31 organizations of the LUKOIL Group;

KSSS-NSI RREM - PJSC LUKOIL and 4 NGDO

1. Specialized depository "Infinitum" (about 40,000).

UDC 004.37.01

OH. Zhilyaev,
Institute of Informatics and Problems of Regional Management,
KBSC RAS, researcher, Nalchik.

Introduction

The creation of a unified information space is a necessary condition for the effective management of various objects, be it an enterprise, department, region or state. The formation of a unified environment presupposes the integration of management processes, accompanied by the normalization of information flows. Often, the movement of information at different levels and in different parts of the managed object is supported by various information and accounting systems. Accordingly, there is a need to integrate these systems. The growing globalization of the world economy is, in essence, a set of integration processes. Such integration tasks are especially relevant for Russia in connection with its upcoming accession to the World Trade Organization (WTO).

The task of integrating information and accounting systems consists of two interrelated parts: data integration and subsequent application integration. When performing data integration, it is necessary to unify and standardize normative and reference information (NSI).

Master data is a conditionally permanent part of all information in an information system (IS), in contrast to current information generated directly in the process of working in the IS. The master data includes: directories, dictionaries, linear and hierarchical lists, classifiers, registers, codifiers, data from which is used in the generation of current documents.

To denote such reference information, the English-language literature uses the term Master Data, and the tasks of managing it are called Master Data Management (MDM). In Russian, however, the concept of normative reference information (NSI) is now used more often; it appeared in disciplines related to economic management even in pre-computer times. Here the word “normative” reflects the fact that the problem of creating directories must be solved taking into account industry, state and international standards.

While terms such as ACS (Automated Control Systems) or IS (Information Systems) have become familiar, the abbreviation “SU NSI” (Regulatory Reference Information Management System) often causes confusion. Even the meaning behind its expansion is often understood only by specialists. NSI is not just a database, but a complexly organized system with many cross-references between individual directories and classifiers. The mechanism for keeping reference information up to date is especially important. The requirements for the completeness, accuracy and relevance of information in the reference data system are much stricter than in a conventional database, since during the operation of any information system, including automated control systems, the information content of applied tasks depends on the reference data. Master data is the “foundation” of the entire information system, and the management of this foundation should be centralized. In Figure 1, reference data is shown as the lower level, the “information foundation” of the entire IS structure.

Fig. 1. Information system levels

It is the centralized management of the reference data system, subject to unified regulations and provided by a unified technological environment, that makes it possible to maintain the unification, completeness, integrity and relevance of all reference books and classifiers it contains. It is therefore a prerequisite for an effectively working IS that solves real problems.

The development of full-fledged software for managing reference data began only a few years ago. Leading software manufacturers have recently been paying more and more attention to master data management (MDM) tools.

It is difficult to imagine solving integration problems without centralized management of reference data. The problem of master data management arises even in such highly automated and information-supported structures as banks or insurance companies. Master data management systems make it possible not only to accumulate data from several integrated banking systems, for example to generate reports across several accounting systems, but also to solve the tasks of operational NSI management.

In Russia there is no single center for the formation of reference data similar to GOSTs. And, although new laws related to the development and circulation of electronic technical documents have recently come into force, they have not yet had a noticeable impact on the situation.

The role of NSI in the informatization of the region

Regional informatization plays an important role in implementing the development strategy for the information technology sector in our country. Recently, work on the use of information technologies in all spheres of life has intensified in the regions of the Russian Federation. This was facilitated by a number of events held by federal government bodies and by the adoption of regulatory documents on the use of information technologies at the federal, departmental, regional and municipal levels. One such document, designed to help solve the problems of comprehensive informatization of a region, is the Decree of the Government of the Russian Federation “On the procedure for the formation and use of basic classifiers, reference books and registers in the provision of state and municipal services in electronic form” dated 08/31/2010.

Reference data is also given a special role in informatization programs for industries and departments. For example, the draft Concept of Informatization of Healthcare, published on March 31, 2010, especially emphasizes that information systems in healthcare should be designed taking into account standards and regulations and be based on a single set of master data. (The reference data used in healthcare, social development and labor relations in the Russian Federation comprises a total of 163 different classifiers and reference books.)

At the regional level, the goal of implementing the reference data infrastructure in automated management systems is to create a unified system of reference books and classifiers used in the state (municipal) information systems of a subject of the Russian Federation, and to form the basic accounting registers that ensure the collection and storage of information on the main objects of regional management. The master data management system, being a centralized repository and the sole supplier of common reference data for all infrastructure and departmental information systems of the region, should ensure the information compatibility of local information systems and the “electronic government” applications of the subject.

Obviously, the next step in the development of information technology in the Russian Federation should be the integration of departmental, regional and municipal information systems at the federal level. This task of integrating government information systems is so complex that, in addition to standardizing documents (for example, on the basis of XML) and providing an integration infrastructure in the form of software that routes XML documents, government efforts are also needed to standardize data descriptions.

An example of an initiative in this area is the e-GMS (UK Government Metadata Standard) standard adopted in the UK. Many countries have taken as a basis the so-called “Dublin Core”, which includes 15 elements of information description:

  • title;
  • author or creator;
  • topic and keywords;
  • description;
  • publisher;
  • other contributors;
  • date;
  • resource type;
  • format;
  • resource identifier;
  • source;
  • language;
  • relation;
  • area (coverage);
  • rights management.

In addition to the elements themselves, the “Dublin Core” has so-called clarifications of elements, for example: “Date of creation”, “Date of publication”, “Expiration date”, etc. Countries can not only use this core, but also add any additional elements they deem necessary. In addition, the first tool when searching for information is usually browsing categories. Therefore, government metadata standards initiatives are defining standards for a list of categories (a primary search tool without the use of keywords).
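As a rough illustration of the element set listed above, a resource description can be held as a simple mapping from element names to values. The sketch below uses the standard Dublin Core element names; the sample record, the extra `audience` element and the `unknown_elements` helper are hypothetical and only show how a country-specific extension could be checked against the core.

```python
# The 15 elements of the Dublin Core element set, in the order listed above.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def unknown_elements(record, extensions=()):
    """Return keys of `record` that are neither core elements nor declared extensions."""
    allowed = set(DUBLIN_CORE_ELEMENTS) | set(extensions)
    return sorted(key for key in record if key not in allowed)


# Hypothetical record; "audience" stands for an element a country chose to add.
record = {
    "title": "Classifier of administrative units",
    "creator": "Regional statistics office",
    "date": "2010-08-31",
    "language": "ru",
    "audience": "municipal services",
}
```

Calling `unknown_elements(record)` flags `audience` until it is declared through `extensions`, which mirrors the idea that the core stays fixed while additions are made explicitly.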

Conclusions

A review of the legislation that regulates the provision of state and municipal services in electronic form and the organization of interdepartmental information interaction at the state and municipal levels reveals:

  • the practical absence in regulatory legal acts of mandatory requirements for the standardization of the information technologies and software used in government information systems, which are necessary to ensure interdepartmental exchange of information;
  • the absence in regulatory legal acts of uniform, clear requirements for the directories, classifiers and data schemes of information systems used in interdepartmental information exchange;
  • the absence in regulatory legal acts of uniform mechanisms, mandatory for all federal, regional and municipal authorities, for providing information and delivering public services in electronic form.

Today, both in the Russian Federation and abroad, projects for providing electronic services at the state, regional and municipal levels, as well as similar interdepartmental projects, require significant efforts to integrate data and applications. The main difficulty in implementing them lies not in the use of any specific technologies, but in organizing the process of adopting the relevant standards and harmonizing the information technology architectures of various organizations and departments.

Projects in the field of provision of electronic services at the state, regional, municipal and departmental levels, which are carried out by governments of different countries, provide for the following main types of standards:

  • data standards;
  • standards for interdepartmental information exchange;
  • metadata (and information retrieval) standards;
  • security standards.

A unified modern methodology for maintaining master data is needed, otherwise, as the amount of data increases, the system will become unmanageable.
The regulations and methodology for filling reference books and classifiers must be spelled out in detail, otherwise it will be extremely difficult to ensure high-quality and orderly work of experts in maintaining reference data. There is a need for a clear delineation of the areas of competence and responsibility of users of reference data and experts in its management.

What is needed is a highly efficient modern technology and a reference data management system that solves the problem of multi-user access with the possibility of physical separation of powers, supports the interaction of users with experts, and scales easily as both the reference data base and the number of serving experts grow.

Literature:
1. “Strategy for the development of the information society in the Russian Federation” (approved by the President of the Russian Federation on February 7, 2008 No. Pr-212);
2. Draft resolution of the Government of the Russian Federation "On the procedure for the formation and use of basic classifiers, directories and registers in the provision of state and municipal services in electronic form" dated August 31, 2010.
3. “Review of NSI”, Publication of the Ministry of Economic Development, 2010
4. "The concept of creating an information system in healthcare for the period until 2020", 2010.
5. Polotnyuk I."Metadata as a basis for integration", PC Week/RE (492), 2005.
6. Ray Wang, Rob Karel."Trends 2008: Master Data Management" 2008.

While working on large-scale automation projects and creating new information systems, we were faced each time with the need to implement a subsystem for maintaining directories, classifiers, registers and other similar objects that make up the customer’s normative reference information (NSI). Over 15 years of working at LANIT with data management systems, we have had clients with a wide variety of requirements. And, of course, different situations arose on these projects. I will tell you about several instructive stories that happened to us. In the article you will find examples that will be useful to many people involved in software development. And for those who work directly with NSI, it will be even more interesting, since it hits closer to home.

Special thanks to the wonderful artist Vasya Lozhkin for the illustrations.

Case one. How to load a wagon and small cart

Creation of a unified counterparty management system for a large production company with many factories throughout the country and abroad.

Objective of the project: create a unified database of counterparties for all divisions. Counterparty management is carried out on the basis of requests, which are assigned priorities from low to urgent. An urgent request must be processed by NSI experts within 2 hours, regardless of the time difference between departments.

Living history
The project was agreed upon with all interested parties (the customer's management convinced us of this) and developed within the given time frame in accordance with the approved requirements.

The presentation of the created counterparty management system went smoothly until one imposing woman, the head of the Siberian branch, stood up and very energetically, using Russian idiomatic expressions, informed those gathered that when a railway car arrives at her branch to load finished products, she will not wait 2 hours while someone in Moscow considers the request to add a buyer.

She is not going to pay for the downtime of the car while the application is being approved, but will enter the buyer’s data into the system as is and ship the goods, and the Moscow comrades can then deal with the information about the buyer as much as they want.

This statement was supported by several more heads of the company's branches, which almost completely destroyed the centralized methodology of maintaining a single counterparty directory on the basis of requests.

As a result, the project was modified so that all branches had access to the counterparty database and could make changes to it directly. At the same time, the system automatically searched for similar records and displayed them to the branch employee, who decided whether the data needed to be adjusted; that decision was later checked by an expert group.
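The automatic search for similar records can be illustrated with a small sketch. It is not the project's actual algorithm: it assumes that comparing normalized names with the standard library's `difflib.SequenceMatcher` is good enough for a first pass, and the `normalize` and `find_similar` helpers, the 0.8 threshold and the company names are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case the name and replace punctuation with spaces, collapsing runs."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def find_similar(new_name, existing_names, threshold=0.8):
    """Return (name, score) pairs whose similarity to new_name reaches the threshold."""
    target = normalize(new_name)
    matches = []
    for name in existing_names:
        score = SequenceMatcher(None, target, normalize(name)).ratio()
        if score >= threshold:
            matches.append((name, round(score, 2)))
    # Best candidates first, so the branch employee sees the likeliest duplicate on top.
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


existing = ["Sibir Trans LLC", "Volga Steel JSC", "Sibirtrans, LLC"]
candidates = find_similar("Sibir-Trans LLC", existing)
```

Here the two spellings of the same buyer are offered as candidates while the unrelated company is not; the final decision stays with the employee and, later, the expert group.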

What we remembered: do not trust the assurances of managers and responsible persons on the customer’s side that all decisions have been agreed upon, everyone is in the loop and there are no objections. Identify all project stakeholders and try to find out the system requirements and constraints directly from them.

Case two. We use it as we want

Creation of a centralized customer management system for an insurance company with a large number of branches and agents throughout the country.

Objective of the project: creation of a consolidated client base for use in analytical applications. The database was collected from all branches, the data was verified, supplemented, and duplicate objects were eliminated. The number of clients in one branch ranges from a thousand to several million. At the same time, there is practically no overlap in clients between branches.

Living history

Once a consolidated customer database was created, it had to be periodically compared with branch databases to identify differences, then process them and upload changes to the consolidated database. The growth of the client base between reconciliations amounted to several thousand records.

To perform the reconciliation, a special module was created, the architecture of which was designed based on the fact that it should quickly compare a large number of records and generate a relatively small XML file with changes for download. The XML format was chosen by the customer.

After implementing the system, we received a message from the customer that the reconciliation module works extremely slowly and generates a huge file for loading into the consolidated database, which they cannot open in any way.

What had happened? The customer was carrying out the initial loading of data from branches into the consolidated directory. The experts found this work tedious and time-consuming, so they simply took the reconciliation module and fed it the complete data of a new branch, which had never been loaded into the consolidated directory.

The reconciliation module, which, in accordance with the technical specifications, was supposed to generate information about differences in the number of several thousand records, received two million records as input, and all of them were missing from the consolidated directory.

As a result, after several hours of superhuman effort, the reconciliation module nevertheless generated a file for downloading, which included all the branch data. And, yes, this file was huge.

The reconciliation module was not used by the customer for its intended purpose, but the customer liked the very fact that reconciliation allows for initial data loading, and he was going to continue working in this way, only he asked to significantly speed up the work of the module and do something with the created file so that it could be opened in a text editor.


In response to our objections that the reconciliation module is not intended for initial data loading, the customer happily showed us the technical specifications and asked: “Where is that written here? We use it as we want!”
As a result, we had to make changes to the architecture of the reconciliation module in order to process large amounts of data and generate an output file in CSV format, since the customer absolutely did not want to give up such a convenient tool.
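The reworked reconciliation described above, comparing branch records with the consolidated base and writing the differences as CSV, might look roughly like the following. This is only a sketch under stated assumptions, not the module we built: it assumes records are dictionaries with an `id` key, that the consolidated base fits in memory as a dictionary, and the `reconcile` function and sample client names are hypothetical.

```python
import csv
import io


def reconcile(branch_rows, consolidated, out, key="id"):
    """Stream branch rows and write only new or changed ones as CSV.

    branch_rows  -- iterable of dicts (may be a generator over a large file)
    consolidated -- dict mapping key value -> row dict already in the consolidated base
    out          -- writable text stream
    Returns the number of rows written.
    """
    writer = None
    written = 0
    for row in branch_rows:
        known = consolidated.get(row[key])
        if known == row:
            continue  # identical record, nothing to upload
        if writer is None:
            # Header is taken from the first row that actually needs writing.
            writer = csv.DictWriter(out, fieldnames=["change"] + list(row))
            writer.writeheader()
        writer.writerow({"change": "new" if known is None else "changed", **row})
        written += 1
    return written


consolidated = {1: {"id": 1, "name": "Ivanov"}, 2: {"id": 2, "name": "Petrov"}}
branch = [
    {"id": 1, "name": "Ivanov"},
    {"id": 2, "name": "Petrova"},
    {"id": 3, "name": "Sidorov"},
]
buffer = io.StringIO()
count = reconcile(iter(branch), consolidated, buffer)
```

Because rows are written one at a time, the same code path handles a few thousand changes and a full initial load of a new branch; only the output size differs.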

What we remembered: Always include a description of the limitations in the specification - what your system should not do. Well, or create solutions that take into account all possible use cases, which is much more expensive.

Case three. Not a baby elephant, but an elephant, and it has to fly too

Creation of a centralized system for maintaining master data for a financial organization.

Objective of the project: creation of a centralized system for maintaining directories and classifiers with distribution of changes to interested systems and databases. Providing external systems with access to directories through the web services of our system.

Typically, customers have an average number of entries per directory from several hundred to several thousand. Our previous record holder was a directory that had 11 million entries. But this customer gave us a surprise. His directory contained over 100 million entries. Loading it took more than a day, because many data checks were performed during the initial load. This would not have been a big problem, but the customer demanded that the directory load in a few minutes.

As a result, we had to greatly change the way the system works with this reference book. In fact, it is maintained outside the system, and we only provide an interface for its use. We are currently developing new ways for our system to work with very large directories. We hope the customer will like it.

What we remembered: In the modern world, there is more and more data, and its growth rate is constantly increasing. The system must be ready for high loads even where they were not initially expected. We are constantly developing our solution, taking into account modern trends in data growth and the increasing requirements for processing speed.

Case four. Difficult trick with files

Creation of a centralized system for maintaining master data in a large bank.

Objective of the project: creation of a centralized system for maintaining directories and classifiers with distribution of changes to interested systems and databases. A special feature of the project is the very complex processes of propagating changes that affect many systems.

Since in the future I will have to mention our own solution to manage the NSI, I will allow myself a small lyrical digression.

Read more about the NORMA system.

The tasks of our customers are largely similar, and we decided to reduce software development costs and shorten project time by creating our own universal platform for maintaining reference data and master data (Reference Data Management & Master Data Management). The system has existed for more than 10 years, and all these years we at LANIT have been actively developing it.

NORMA supports centralized and distributed reference data management. All data and meta-information are maintained taking into account the history of changes, and the system allows you to view and change the entire array of reference data for an arbitrary date in the past or future. Processes for coordination and approval of changes can be configured for directories. The system includes a dedicated change distribution server, which allows you to interact with external systems through various interfaces and create fairly complex integration business processes (a sort of mini BizTalk Server). We have data export/import packages that can upload/load directory data into databases and files of various formats. Maintaining conversion tables for external systems is supported.
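Viewing reference data as of an arbitrary past or future date can be pictured with effective-dated versions. The sketch below is a generic illustration and makes no claim about how NORMA stores its history; the `VersionedEntry` class and the sample values are hypothetical.

```python
from bisect import bisect_right
from datetime import date


class VersionedEntry:
    """Keeps every version of one directory entry, keyed by its effective date."""

    def __init__(self):
        self._dates = []   # sorted effective dates
        self._values = []  # value that becomes valid on the matching date

    def set(self, effective, value):
        """Record that `value` is valid starting from `effective` (past or future)."""
        pos = bisect_right(self._dates, effective)
        if pos and self._dates[pos - 1] == effective:
            self._values[pos - 1] = value  # overwrite a version with the same date
        else:
            self._dates.insert(pos, effective)
            self._values.insert(pos, value)

    def as_of(self, day):
        """Return the value valid on `day`, or None if no version existed yet."""
        pos = bisect_right(self._dates, day)
        return self._values[pos - 1] if pos else None


# Hypothetical entry with one past version and one later version.
rate = VersionedEntry()
rate.set(date(2016, 1, 1), "18%")
rate.set(date(2019, 1, 1), "20%")
```

A query such as `rate.as_of(date(2017, 6, 1))` then returns the version that was in force on that day, and a version dated in the future simply becomes visible once the requested date reaches it.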

NORMA includes a graphical query builder and report designer. In addition to working with its own directories, the system allows, through its interface, to view and change directories that are located in databases external to it, as well as use these directories in the query builder and export/import packages.

In response to the occurrence of various events in the system, for example, events of changes to the directory, plug-in software components written in C# can be launched, which can both check data and interact with external systems and, in fact, the NORMA system itself. Almost all system functions are available through web services.
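The idea of plug-in components reacting to system events can be sketched as a small publish/subscribe registry. In NORMA these components are written in C#; the Python below only illustrates the pattern, and the `EventBus` class, the `directory.changed` event name and the `require_tax_id` check are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal registry that runs plug-in handlers when a named event occurs."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        """Attach a handler; it must return a list of error messages."""
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        """Call every handler for the event and collect the errors they return."""
        errors = []
        for handler in self._handlers[event_name]:
            errors.extend(handler(payload))
        return errors


def require_tax_id(record):
    """Hypothetical validation plug-in: a counterparty must carry a tax id."""
    return [] if record.get("tax_id") else ["tax_id is missing"]


bus = EventBus()
bus.subscribe("directory.changed", require_tax_id)
```

A change would be accepted only when `bus.publish("directory.changed", record)` returns an empty list; a handler could equally call out to an external system instead of validating.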

The system can be scaled both vertically by increasing the power of the application server and database, and horizontally by using a multi-node application server, in which each node or group of nodes is responsible for performing a separate function. To store reference data, the system can use Microsoft SQL Server, Oracle or PostgreSQL.


Typically, when creating references and change propagation processes, the customer consults with our analysts about which tool or set of tools provided by the system is best to use for a particular task. This time the customer said that he would create directories and processes independently.

After some time, one of the customer’s specialists contacted us with a complaint that his data was not being loaded into the system. As confirmation, we were sent a data import package, a source file with the records being loaded, and an error message stating that the data being loaded was of the wrong type.

We started to investigate. We turned the package this way and that and tried different representations of the source data, but we could not reproduce the error. We contacted the customer with questions: maybe software components are attached to the import package, maybe some additional restrictions are imposed on the directory, maybe the data comes from a different process? The answer to everything was the same: there is nothing like that, everything should load easily, and it worked before.


It turned out that this import package was just the tip of the iceberg. Briefly and greatly simplified, the following happened. The import procedure loaded the correct data from the source file into the directory. The source file was then deleted. Our system propagated the changes to multiple databases, one of which compared its own data with our changes and generated a discrepancy file that was returned to our system for loading. Moreover, to load this file, the customer used the same import procedure as for the source file. And it was this file, generated by the external system, that contained data of the wrong type. Naturally, when analyzing the source file, we could not find any errors, and we were told nothing about the second file or the sprawling process of distributing changes.

What we remembered: Always check the information you receive, even if they tell you that we have a little problem here, and it’s in this very place, I swear to my mother! Analyze the problem in context.

Case five. I'm getting used to the inconsistencies

Creation of a master data management system in a manufacturing company.

Objective of the project: creation of a system for maintaining reference data in a management company with many branches, factories and design departments.

This time we did not progress beyond a few presentations. The technical staff really liked our NORMA system. It covered all their existing problems. Then came the turn to show the system to management, and here the letdown of the decade happened. A senior executive looked, listened and said: “We all work here on Apple products, they have a certain style, but your system does not fit into this style. We won’t even consider it.”


What we remembered: Customers are different, and for some you are simply not suitable. The style is different.

Similar stories happen in various projects. What was interesting in your project life? What was an unexpected lesson for you? Share in the comments.
