Theory: structure of a high-load service?

I would like the Habr community to tell me which of my ideas are wrong. So, here goes.

Goal: build a service that can scale horizontally and that may become highly loaded in the future.

Here are my thoughts on the subject, with my questions on each item:

— a domain (the name is made up): hls.com

— At the registrar, the domain is assigned the maximum number of DNS servers (6?), which we own and which are scattered all over the world (does that make sense?)

— The DNS zone contains the maximum number of A and AAAA records (32?) in order to get DNS round-robin.

— Behind each address listed in DNS sits a load balancer (hardware or software? how does the load balancer decide which server gets a request, and how does it determine the least loaded server?)

— Each load balancer manages a certain number of nginx servers (or some other software; if so, which? can nginx choose the least loaded server?)

— Each nginx server fronts a certain number of web servers that actually serve the content.

— Each web server machine runs Apache HTTP Server with PHP or Ruby, plus a local memcached (is a local one worth it or not?)
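
To make the "least loaded server" question in the list above concrete, here is a minimal sketch of one common interpretation: least connections, where the balancer tracks in-flight requests per backend and picks the smallest count. The class names and the backend names `web1`..`web3` are hypothetical, and a real balancer may well use other signals (response time, weights, health).

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str        # hypothetical backend label
    active: int = 0  # requests currently in flight on this backend


class LeastConnections:
    """Send each new request to the backend with the fewest in-flight requests."""

    def __init__(self, names):
        self.backends = [Backend(n) for n in names]

    def acquire(self) -> Backend:
        # min() returns the first minimum, so ties go to the earliest backend.
        backend = min(self.backends, key=lambda b: b.active)
        backend.active += 1
        return backend

    def release(self, backend: Backend) -> None:
        # Call this when the request has finished.
        backend.active -= 1


lb = LeastConnections(["web1", "web2", "web3"])
first = lb.acquire()   # all idle, the tie goes to web1
second = lb.acquire()  # web1 is busy, so web2 is chosen
lb.release(first)      # web1 becomes idle again
third = lb.acquire()   # web1 and web3 are idle, the tie goes to web1
```

"Load" here means only the number of open requests; it says nothing about CPU or memory on the machine, which is one reason the choice of balancing rule depends on the workload.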

The web servers use 2 types of databases: one stores the relationships between objects, the other stores the objects themselves. Both must be able to scale horizontally.

For the distributed object store, use something like MemcacheDB or BigTable (or something else? The idea is that each object has a unique key that carries not only the ID of the object itself but also information about the object's type)
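
A minimal sketch of the key idea just described: a key that encodes both the object type and its ID, plus a stable hash that maps the key to one of several storage nodes. The `kind:id` string format, the type names, and the shard count are assumptions made for illustration, not something a particular store prescribes.

```python
import hashlib
from typing import NamedTuple


class ObjectKey(NamedTuple):
    kind: str  # object type, e.g. "user" or "photo" (hypothetical names)
    id: int    # ID of the object within its type

    def to_string(self) -> str:
        return f"{self.kind}:{self.id}"

    @classmethod
    def from_string(cls, raw: str) -> "ObjectKey":
        kind, _, object_id = raw.partition(":")
        return cls(kind, int(object_id))


def shard_for(key: ObjectKey, shards: int) -> int:
    """Map a key to a shard number with a hash that is stable across processes."""
    digest = hashlib.sha1(key.to_string().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shards


key = ObjectKey("user", 42)
raw = key.to_string()               # "user:42"
restored = ObjectKey.from_string(raw)
shard = shard_for(key, 4)           # an integer in range(4)
```

Plain modulo sharding moves most keys when the shard count changes; consistent hashing is the usual way to soften that.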

For the distributed storage of relationships, use some kind of graph database (is that right? if so, which one?)

— There are also 2 sets of memcached servers which cache requests to both types of database.


Habr people, what do you think: am I heading in the right direction? What have I not considered? Where can I read about this? Who has already done it? Please enlighten me.

7 Answers

In my case the project was written haphazardly. More precisely, it was written quite correctly, but with no thought that there would be many users and that it would have to scale. More or less nice code, a lot of tables linked to one another, queries with almost ten JOINs. Caching was not used at all.

Everything worked (and still works) on 3 servers: a PostgreSQL database, nginx for static files, and nginx with gunicorn for the application.

For the first two years this was enough, but with the growing number of users and features we now have to periodically sit down and rewrite pieces of code: denormalize the database to avoid JOINs and lookups in auxiliary tables, try to bolt on caching (the biggest headache: caching has to be planned for from the very beginning and thought through very carefully), and so on.

I am just describing my own experience. I think the moral is that there is no need to overcomplicate things from the start. You have to think about performance, but not fanatically. Most likely, simple code and one or two servers will be enough at first. It is unlikely that you will immediately match Facebook in popularity. On the contrary, those who think their project will take over the world right away are often wrong.
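
For readers who have not met the pattern, the caching mentioned above usually means something like cache-aside: look in the cache first, fall back to the database, store the result, and invalidate on writes. The sketch below is an illustration only: `TTLCache` is an in-process stand-in for memcached, and `load_profile_from_db` is a hypothetical placeholder for an expensive multi-JOIN query, not code from the project described.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached (an assumption for illustration)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # the entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

    def delete(self, key):
        self._data.pop(key, None)


db_calls = []  # records every simulated database hit


def load_profile_from_db(user_id):
    """Hypothetical slow query, e.g. one with many JOINs."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user{user_id}"}


cache = TTLCache(ttl_seconds=60)


def get_profile(user_id):
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                    # cache hit, no database work
    profile = load_profile_from_db(user_id)
    cache.set(key, profile)              # cache miss, remember the result
    return profile


def invalidate_profile(user_id):
    # Call after every write to the profile, otherwise readers see stale data.
    cache.delete(f"profile:{user_id}")
```

The hard part is not the lookup but deciding where `invalidate_profile` has to be called, which is why retrofitting caching onto existing code is painful.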
by
Your question is titled "theory", yet what follows is a presentation of some practical details, and a very vague one at that. As has already been said above, your approach is fundamentally wrong.

Every architectural decision depends on the specific task. That is what system architects are for: their job is painstaking analysis of the project's tasks and the selection of specific technical solutions for each particular case. In large, heavily loaded, constantly evolving projects these people work on a permanent basis and draw a salary.

Nobody can help you in this case, for two reasons:
1) You have not disclosed all the technical details of your project. Mentioning pictures, a social network, and so on is not enough; at the very least you need a detailed multi-page description of all the required functions... Not to mention that it would be good to specify the available resources and to estimate the load.
2) This is not something that gets done on the back of an envelope. A sensible detailed analysis can take several months, and of course nobody will do it for free. There are some theoretical foundations, but they are so theoretical that you have not even touched on them above. The number of DNS servers, A records, nginx instances, PHP, the database design, and so on are all practical matters that depend heavily on the task. You could implement everything you have written and end up with a cumbersome, unwieldy, poorly scalable application that costs a fortune to run. Based on what you wrote, I can only advise you not to do this, because your approach and your ideas were wrong from the start. And any practical tips that have been or will be written for you here are nothing more than the personal, unsubstantiated experience of people solving their own (not your) problems, which may be radically different.

I can only share how I myself go about choosing a specific technical solution, step by step:
1) Requirements gathering. It is important to collect and identify as many requirements as possible for the specific task at hand. For example, all the requirements for data storage in such a service.
2) Select as many options as possible with which the task is actually doable, then exclude those that obviously do not fit the requirements, leaving only those that satisfy them best (it sometimes happens that the requirements cannot, in principle, all be satisfied).
3) A technical solution is always a compromise. From the remaining options you have to choose the most suitable one; often you need to run comparison tests (your own tests, which in one way or another simulate your task). If you are still not satisfied, perhaps you should reconsider the requirements or split the task into several, if possible. Either way, that takes you back to item 1.
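
The three steps above can be sketched as a small filter-then-rank routine. Everything in it is a placeholder: the option names `store_a`..`store_c`, their properties, and the benchmark numbers are made up for illustration and would come from your own requirements and your own tests.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str            # placeholder name of a candidate solution
    properties: dict     # facts gathered about the candidate
    benchmark_ms: float  # result of your own test simulating your task


def shortlist(options, requirements):
    """Step 2: keep only the options that satisfy every requirement."""
    return [
        o for o in options
        if all(check(o.properties) for check in requirements.values())
    ]


def pick(options, requirements):
    """Step 3: among the fitting options, take the best one by your own benchmark."""
    fitting = shortlist(options, requirements)
    if not fitting:
        return None  # nothing fits: revisit the requirements (back to step 1)
    return min(fitting, key=lambda o: o.benchmark_ms)


# Step 1: requirements written as named checks (illustrative values only).
requirements = {
    "scales horizontally": lambda p: p["horizontal_scaling"],
    "stores 512 KB objects": lambda p: p["max_object_kb"] >= 512,
}

options = [
    Option("store_a", {"horizontal_scaling": True, "max_object_kb": 1024}, 12.0),
    Option("store_b", {"horizontal_scaling": False, "max_object_kb": 4096}, 5.0),
    Option("store_c", {"horizontal_scaling": True, "max_object_kb": 2048}, 8.0),
]

fitting_names = [o.name for o in shortlist(options, requirements)]
chosen = pick(options, requirements)
```

Note that `store_b` is the fastest in this made-up data and still loses, because it fails a hard requirement before any benchmark is considered.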

Bonus track 1: KISS
Bonus track 2: One size never fits all
by
You're overthinking... For a long time, a well-designed database structure and ordinary caching will be enough for you.
by
This is not the place to start. Start with the application architecture. Let me remind you that of the three (high availability, data consistency, and performance) you can only choose two.
by
The rest of your reasoning, the part not concerning databases, also raises a lot of questions. First, if you already have a hardware load balancer, why put nginx behind it for the same purpose? Why this pile-up of HTTP servers? Passing traffic through a tree of web servers does not add speed; quite the opposite. Why doesn't nginx balance directly across the application servers? Why do you need Apache? As I understand it, you are not selling hosting, which is where Apache's main charm comes into play, along with the small extra drag of .htaccess files. All your remarks about caching and about sets of memcached servers also make no sense without a clear understanding of what to cache, when, how, and on what basis. Caching is sometimes even harmful, and it is certainly always labor-intensive. Resort to it, first, only if it is actually possible, and second, only when it is really necessary and can give a significant speed-up.

You also asked how the load balancer will distribute the load. Again, you decide that yourself, based on your tasks and on the principle it should work by; there is no magic here either. You did not mention anything about sessions: how will you handle them, if you have any?
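
As one possible answer to the session question, here is a minimal sketch of session affinity: hash the session ID so that the same session always lands on the same backend. The function name and backend names are hypothetical; the other common approach is to keep sessions in a shared store so that any backend can serve any request.

```python
import hashlib
from typing import Sequence


def backend_for_session(session_id: str, backends: Sequence[str]) -> str:
    """Pick a backend deterministically from the session ID."""
    # A cryptographic hash is used only because it is stable across processes,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["app1", "app2", "app3"]  # hypothetical application servers
a = backend_for_session("session-abc", backends)
b = backend_for_session("session-abc", backends)  # same backend as `a`
```

With plain modulo hashing, adding or removing a backend remaps most sessions, so this only works well if the pool is stable or if losing a session is cheap.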

A significant role in a high-load project is played by easy, simple maintenance of this large fleet of machines: easy configuration of new machines and adding them to the pool, automatic removal and reconfiguration when something fails, and therefore quick diagnostics and monitoring. These questions are not covered at all, yet a rather complex system has been devised. Likewise, vertical scaling is not necessarily an obvious dead end or the wrong path; for a great many projects it will even be the better choice.
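
The automatic removal from the pool mentioned above can be sketched as a failure counter per backend: a machine leaves the pool after several consecutive failed health checks and returns after a successful one. The class, the threshold of 3, and the backend names are assumptions for illustration; the probe itself (an HTTP request, a TCP connect) is left out.

```python
class BackendPool:
    """Track consecutive health-check failures and expose the healthy backends."""

    def __init__(self, backends, max_failures=3):
        self.max_failures = max_failures
        self.failures = {name: 0 for name in backends}

    def report(self, name, ok):
        # A successful check resets the counter; a failed one increments it.
        self.failures[name] = 0 if ok else self.failures[name] + 1

    def healthy(self):
        return [n for n, f in self.failures.items() if f < self.max_failures]


pool = BackendPool(["web1", "web2"])
for _ in range(3):
    pool.report("web2", ok=False)  # three failed checks in a row
after_failures = pool.healthy()    # web2 has been taken out of rotation
pool.report("web2", ok=True)       # one successful check brings it back
after_recovery = pool.healthy()
```

Requiring several failures in a row avoids dropping a machine because of a single slow response.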

You did, however, mention some kind of graph database. I have never heard of them being at all widely used in web projects, high-load ones included. And you are going to use one, and scale it too? That also raises a lot of questions.
by
If you do not organize the architecture properly at the start, you will be shoveling a pile of crap later. Usually, though, you build a project so that it simply works, and as load appears you scale it and eliminate bottlenecks.
by
I advise you not to rack your brains, and to build something that works. Even a single server can handle a lot. Then, when the time comes, you move SQL to a separate server, then set up an SQL cluster, put nginx load balancing in front, and so on. In general, think it over as you go ;)
by

asked Mar 23, 2019 by outself