Coders Gun: Teradata

Welcome to the world of Teradata

The first question on your mind is probably: why Teradata?

You will find answers to basic questions like this one in this blog, explained simply. Once you have gone through all my posts on Teradata, you should have a solid grasp of it.

Suppose you are the CEO of a company and you want to analyze the trend of your company's business (product sales, orders, profit/loss, revenue) over the past 2 years.

You call your company's top technical team and ask them to generate the reports. For this, the team needs current data as well as data from the past 2 years (historical data). This large volume of data is stored in a data warehouse (a very large database).

This raises the next question: which database should you choose for performance?
Teradata is a leading choice here. Features such as Massively Parallel Processing (MPP), Shared Nothing Architecture, and Linear Scalability are what make it so strong in the data warehouse market. I will explain the full architecture of Teradata in coming posts. For now, we will just get a basic idea of these features.

AMP, an acronym for "Access Module Processor," is the type of vproc (Virtual Processor) used to manage the database, handle file tasks, and manipulate the disk subsystem in the multi-tasking and possibly parallel-processing environment of the Teradata Database.

Each AMP attached to the Teradata system listens to the Parsing Engine (PE) via the BYNET for instructions. Each AMP is connected to its own disk, and it can read and write data on that disk ONLY. An AMP is best thought of as a computer processor with its own disk attached to it. This is known as the ‘SHARED NOTHING ARCHITECTURE’. Teradata spreads the rows of a table evenly across all the AMPs. When the PE asks for data, all AMPs work simultaneously, each reading the records from its own disk. This is known as Parallelism.
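
To make the idea concrete, here is a minimal Python sketch of the paragraph above. It is only a toy model under stated assumptions: the number of AMPs, the function names, and the use of an MD5 hash to pick an AMP are all made up for illustration and are not Teradata's actual hashing algorithm. Threads stand in for AMPs, and plain Python lists stand in for their disks.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_AMPS = 4  # hypothetical system size, chosen only for this example


def amp_for(key, num_amps=NUM_AMPS):
    """Pick an AMP from a hash of the row's key (illustrative, not Teradata's real hash)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def distribute(rows, key_field, num_amps=NUM_AMPS):
    """Spread rows across per-AMP 'disks'; each row lands on exactly one disk."""
    disks = [[] for _ in range(num_amps)]
    for row in rows:
        disks[amp_for(row[key_field], num_amps)].append(row)
    return disks


def amp_scan(disk, predicate):
    """One AMP's work: read ONLY its own disk and keep the matching rows."""
    return [row for row in disk if predicate(row)]


def parallel_select(disks, predicate):
    """The 'PE' asks every AMP at once, then merges the partial answers."""
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        partials = list(pool.map(lambda disk: amp_scan(disk, predicate), disks))
    return [row for part in partials for row in part]


# Made-up sample data: 1000 orders spread over 4 AMPs.
orders = [{"order_id": i, "amount": 10 * i} for i in range(1, 1001)]
disks = distribute(orders, "order_id")
big_orders = parallel_select(disks, lambda row: row["amount"] > 9900)
print([len(disk) for disk in disks])  # rows held by each AMP
print(len(big_orders))                # rows matching the query
```

No AMP ever touches another AMP's list, which is the shared nothing idea, and every AMP scans its own share at the same time, which is the parallelism.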

BYNET: The BYNET is the communication channel between the PE and the AMPs. A Teradata system always has two BYNETs, called ‘BYNET 0’ and ‘BYNET 1’, although we refer to them together as a single BYNET system. There are two reasons for having two BYNETs:
1) If one BYNET fails, the second BYNET takes its place.
2) Two BYNETs improve the performance of the system, because the PE and the AMPs can talk to each other over both.
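
The two reasons above can be sketched in a few lines of Python. This is a hypothetical model, not how the real BYNET is implemented: the class names `Channel` and `DualBynet` and the simple round-robin rule are assumptions made for illustration.

```python
class Channel:
    """One BYNET link (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.delivered = []

    def send(self, message):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.delivered.append(message)


class DualBynet:
    """Two links treated as one system: share the load, survive one failure."""

    def __init__(self):
        self.channels = [Channel("BYNET 0"), Channel("BYNET 1")]
        self._next = 0  # index of the link to try first

    def send(self, message):
        count = len(self.channels)
        for offset in range(count):
            index = (self._next + offset) % count
            channel = self.channels[index]
            if channel.up:
                channel.send(message)
                self._next = (index + 1) % count  # alternate links next time
                return channel.name
        raise ConnectionError("both BYNETs are down")


net = DualBynet()
healthy = [net.send(f"step {i}") for i in range(4)]
print(healthy)  # ['BYNET 0', 'BYNET 1', 'BYNET 0', 'BYNET 1']

net.channels[0].up = False  # simulate a failure of BYNET 0
degraded = [net.send(f"step {i}") for i in range(4, 7)]
print(degraded)  # ['BYNET 1', 'BYNET 1', 'BYNET 1']
```

While both links are up, the messages alternate between them (reason 2). When one link fails, all traffic moves to the other (reason 1).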

Summary:

MPP (Massively Parallel Processing) - Each AMP has its own disk and works on it in parallel with the other AMPs.

Shared Nothing - No AMP reads or writes another AMP's disk, so no AMP interferes with the work of the others.

Linear Scalability - AMPs can be added as required, and the system's capacity grows roughly in proportion to the number of AMPs.
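
As a rough back-of-the-envelope illustration of linear scalability, the sketch below assumes an idealized system in which rows are spread perfectly evenly and every AMP scans at the same speed. The function name, the row count, and the scan rate are all made-up numbers, not Teradata measurements.

```python
def ideal_scan_seconds(total_rows, num_amps, rows_per_second_per_amp):
    """Idealized full-table scan time: every AMP scans an equal share in parallel."""
    rows_per_amp = total_rows / num_amps
    return rows_per_amp / rows_per_second_per_amp


TOTAL_ROWS = 1_000_000_000  # hypothetical table size
AMP_RATE = 1_000_000        # hypothetical rows scanned per second by one AMP

for amps in (10, 20, 40):
    seconds = ideal_scan_seconds(TOTAL_ROWS, amps, AMP_RATE)
    print(amps, "AMPs ->", seconds, "seconds")
# 10 AMPs -> 100.0 seconds
# 20 AMPs -> 50.0 seconds
# 40 AMPs -> 25.0 seconds
```

In this model, doubling the number of AMPs halves the scan time. A real system will not scale perfectly, but this is the behavior that the term describes.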

Sunday, September 11, 2016
