How many threads is too many? [closed]
Published: 2019-05-09



I am writing a server, and I dispatch each incoming request to its own thread. I do this because almost every request makes a database query. I am using a thread pool library to cut down on the construction and destruction of threads.

My question, though: what is a good cutoff point for I/O threads like these? I know it would only be a rough estimate, but are we talking hundreds? Thousands?


EDIT:

Thank you all for your responses. It seems I will just have to test it to find my thread count ceiling. The question, though, is: how do I know I've hit that ceiling? What exactly should I measure?



#2

One thing you should keep in mind is that Python (at least CPython, the C-based implementation) uses what's called a Global Interpreter Lock (GIL), which can have a huge impact on performance on multi-core machines.

If you really need to get the most out of multithreaded Python, you might want to consider using Jython or something similar.
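
As a rough illustration of where the GIL does and does not bite: CPython releases the lock while a thread is blocked in a call such as time.sleep or a socket read, so threads that mostly wait (as in the question) still overlap. In this minimal sketch, fake_db_query is a hypothetical stand-in for a blocking database call, not part of any real library.

```python
import threading
import time


def fake_db_query(delay_s: float) -> None:
    """Hypothetical stand-in for a blocking DB call; sleeping releases the GIL."""
    time.sleep(delay_s)


def run_concurrently(n_threads: int, delay_s: float) -> float:
    """Run n_threads blocking calls at once and return the elapsed seconds."""
    threads = [
        threading.Thread(target=fake_db_query, args=(delay_s,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


# Four 0.2 s waits run back to back would need about 0.8 s in total.
elapsed = run_concurrently(4, 0.2)
```

If the waits overlap, elapsed stays close to a single delay rather than four of them. CPU-bound Python code would not show the same gain under the GIL.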


#3

This question has been discussed quite thoroughly and I didn't get a chance to read all the responses. But here are a few things to take into consideration when looking at the upper limit on the number of simultaneous threads that can co-exist peacefully in a given system.

  1. Thread stack size: on Linux the default thread stack size is 8 MB (you can use ulimit -a to find it out).
  2. The maximum virtual memory that a given OS variant supports. Linux kernel 2.4 supports a memory address space of 2 GB; with kernel 2.6 it is a bit bigger (3 GB).
  3. [1] shows the calculation of the maximum number of threads for a given maximum supported VM. For 2.4 it turns out to be about 255 threads; for 2.6 the number is a bit larger.
  4. What kind of kernel scheduler you have. Comparing the Linux 2.4 kernel scheduler with 2.6, the latter gives you O(1) scheduling with no dependence on the number of tasks in the system, while the former is more of an O(n). The SMP capabilities of the kernel scheduler also play a large role in the maximum number of sustainable threads in a system.
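
The arithmetic behind items 1 to 3 can be sketched as a simple division of address space by stack size. This is only an upper bound under the figures quoted above (2 GB or 3 GB of address space, 8 MB stacks); the function name is made up for illustration, and real limits are lower because code, heap and libraries also need address space.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def max_threads_by_stack(address_space_bytes: int, stack_bytes: int) -> int:
    """Upper bound on thread count if the address space held nothing but stacks."""
    return address_space_bytes // stack_bytes


limit_2gb = max_threads_by_stack(2 * GIB, 8 * MIB)  # the 2 GB case from the list
limit_3gb = max_threads_by_stack(3 * GIB, 8 * MIB)  # the 3 GB case from the list
```

The 2 GB case works out to 256, which lines up with the "about 255" quoted above once a little space is left for everything else.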

Now, you can tune your stack size to accommodate more threads, but then you have to take into account the overhead of thread management (creation/destruction and scheduling). You can enforce CPU affinity for a given process, as well as for a given thread, to tie them to specific CPUs and so avoid thread migration overhead between CPUs and cold-cache issues.
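
A minimal sketch of the stack-size tuning mentioned above, using Python's threading.stack_size. The 256 KiB figure and the function name are assumptions chosen for illustration; some platforms reject certain sizes (ValueError) or do not allow changing the size at all (RuntimeError), so treat this as a starting point rather than a recipe.

```python
import threading


def run_with_small_stacks(n_threads: int, stack_bytes: int = 256 * 1024) -> int:
    """Start n_threads workers with a reduced stack and return how many ran."""
    done = []
    lock = threading.Lock()

    def worker(i: int) -> None:
        with lock:
            done.append(i)

    # The new size applies to threads started after this call;
    # the call returns the size that was in effect before.
    previous = threading.stack_size(stack_bytes)
    try:
        threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return len(done)
```

Smaller stacks let more threads fit into the same address space, at the price of a lower recursion depth per thread.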

Note that one can create thousands of threads at will, but when Linux runs out of VM it just randomly starts killing processes (and thus threads). This is to keep the utility profile from being maxed out. (The utility function describes system-wide utility for a given amount of resources. With constant resources, in this case CPU cycles and memory, the utility curve flattens out as the number of tasks grows.)

I am sure the Windows kernel scheduler also does something of this sort to deal with overutilization of resources.

[1]


#4

I've often heard: as many threads as CPU cores.
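
For reference, that rule of thumb might look like this in Python, using os.cpu_count() and the standard ThreadPoolExecutor. The helper name is hypothetical, and the squaring task is only a placeholder workload.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def cpu_sized_pool():
    """Return (worker_count, executor) sized to the number of CPU cores."""
    workers = os.cpu_count() or 1  # cpu_count() can return None
    return workers, ThreadPoolExecutor(max_workers=workers)


workers, pool = cpu_sized_pool()
with pool:
    # Placeholder workload: square a few numbers on the pool.
    squares = list(pool.map(lambda x: x * x, range(5)))
```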


#5

Some people would say that two threads is too many - I'm not quite in that camp :-)

Here's my advice: measure, don't guess. One suggestion is to make it configurable and initially set it to 100, then release your software to the wild and monitor what happens.

If your thread usage peaks at 3, then 100 is too much. If it remains at 100 for most of the day, bump it up to 200 and see what happens.
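
One way to make the limit configurable, sketched under the assumption that it comes from an environment variable. The variable name MAX_THREADS and the helper are invented for this example; the default of 100 is the starting value suggested above.

```python
import os

DEFAULT_MAX_THREADS = 100  # the initial value suggested in the answer


def max_threads_from_env(env=None, name="MAX_THREADS"):
    """Read the pool limit from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return DEFAULT_MAX_THREADS
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 1:
        raise ValueError(f"{name} must be at least 1, got {value}")
    return value
```

Bumping the limit from 100 to 200 then becomes a deployment setting rather than a code change.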

You could actually have your code itself monitor usage and adjust the configuration for the next time it starts, but that's probably overkill.


For clarification and elaboration:

I'm not advocating rolling your own thread pooling subsystem; by all means use the one you have. But, since you were asking about a good cut-off point for threads, I assume your thread pool implementation has the ability to limit the maximum number of threads created (which is a good thing).

I've written thread and database connection pooling code and they have the following features (which I believe are essential for performance):

  • a minimum number of active threads.
  • a maximum number of threads.
  • shutting down threads that haven't been used for a while.

The first sets a baseline for minimum performance in terms of the thread pool client (this number of threads is always available for use). The second sets a restriction on resource usage by active threads. The third returns you to the baseline in quiet times so as to minimise resource use.
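
To make the three features concrete, here is a toy pool in Python. It is a sketch for illustration only, not the answerer's code and not a replacement for an existing pool library: the class name ElasticPool and its parameters are invented, the growth heuristic is deliberately simple, and errors raised by tasks are swallowed.

```python
import queue
import threading


class ElasticPool:
    """Toy thread pool with a minimum, a maximum, and an idle timeout."""

    def __init__(self, min_threads, max_threads, idle_timeout_s):
        self.min_threads = min_threads
        self.max_threads = max_threads
        self.idle_timeout_s = idle_timeout_s
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self._alive = min_threads  # threads currently running the worker loop
        self._idle = 0             # threads currently waiting for a task
        for _ in range(min_threads):
            self._start_worker()

    def _start_worker(self):
        threading.Thread(target=self._worker, daemon=True).start()

    @property
    def alive(self):
        with self._lock:
            return self._alive

    def submit(self, task):
        """Queue a zero-argument callable, growing the pool if nobody is free."""
        with self._lock:
            backlog = self._tasks.qsize()
            grow = backlog >= self._idle and self._alive < self.max_threads
            if grow:
                self._alive += 1
        self._tasks.put(task)
        if grow:
            self._start_worker()

    def join(self):
        """Block until every queued task has finished."""
        self._tasks.join()

    def _worker(self):
        while True:
            with self._lock:
                self._idle += 1
            try:
                task = self._tasks.get(timeout=self.idle_timeout_s)
            except queue.Empty:
                task = None
            with self._lock:
                self._idle -= 1
                if task is None and self._alive > self.min_threads:
                    self._alive -= 1  # surplus thread: retire after idling
                    return
            if task is None:
                continue  # baseline thread: keep waiting
            try:
                task()
            except Exception:
                pass  # sketch only: a real pool would log or propagate this
            finally:
                self._tasks.task_done()
```

Under a burst of work the pool grows up to max_threads; once the queue has been empty for idle_timeout_s, the surplus threads exit and only the baseline remains.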

You need to balance the resource usage of having unused threads (A) against the resource usage of not having enough threads to do the work (B).

(A) is generally memory usage (stacks and so on), since a thread doing no work will not be using much of the CPU. (B) will generally be a delay in the processing of requests as they arrive, because they need to wait for a thread to become available.

That's why you measure. As you state, the vast majority of your threads will be waiting for a response from the database, so they won't be running. There are two factors that affect how many threads you should allow for.

The first is the number of DB connections available. This may be a hard limit unless you can increase it at the DBMS - I'm going to assume your DBMS can take an unlimited number of connections in this case (although you should ideally be measuring that as well).

Then, the number of threads you should have depends on your historical use. The minimum you should have running is the minimum number that you've ever had running plus A%, with an absolute minimum of, for example, 5 (make it configurable, just like A).

The maximum number of threads should be your historical maximum plus B%.

You should also be monitoring for behaviour changes. If, for some reason, your usage goes to 100% of available for a significant time (so that it would affect the performance of clients), you should bump up the maximum allowed until it's once again B% higher.
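
The sizing rule above can be written down as a small calculation. In this sketch A and B are whole-number percentages and the floor defaults to the example value of 5; the function names and the sample figures in the usage lines are invented for illustration.

```python
def scale_up(count: int, pct: int) -> int:
    """Return count plus pct percent, rounded up, using integer arithmetic."""
    return (count * (100 + pct) + 99) // 100


def pool_bounds(hist_min: int, hist_max: int, a_pct: int, b_pct: int, floor: int = 5):
    """Derive (min_threads, max_threads) from historical usage.

    min = historical minimum + A%, but never below the configurable floor
    max = historical maximum + B%
    """
    min_threads = max(floor, scale_up(hist_min, a_pct))
    max_threads = max(min_threads, scale_up(hist_max, b_pct))
    return min_threads, max_threads


quiet_site = pool_bounds(hist_min=3, hist_max=100, a_pct=20, b_pct=10)
busy_site = pool_bounds(hist_min=40, hist_max=180, a_pct=25, b_pct=10)
```

Recomputing the bounds whenever the historical figures change gives the "bump it up until it's once again B% higher" behaviour described above.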


In response to the "what exactly should I measure?" question:

What you should measure specifically is the maximum number of threads in concurrent use (e.g., waiting on a return from the DB call) under load. Then add a safety factor of 10%, for example (emphasised, since other posters seem to take my examples as fixed recommendations).
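
A minimal sketch of that measurement in Python: a counter that records the peak number of handlers running at the same time, plus the example 10% safety factor. InUseGauge, handle_request, run_load and suggested_max are hypothetical names, and the short sleep stands in for the wait on the database.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InUseGauge:
    """Tracks how many handlers are active right now and the highest value seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self.current -= 1
        return False  # do not suppress exceptions from the handler


gauge = InUseGauge()


def handle_request(delay_s: float = 0.02) -> None:
    with gauge:
        time.sleep(delay_s)  # stand-in for waiting on the DB call


def run_load(n_requests: int, n_threads: int) -> int:
    """Push n_requests through a pool of n_threads and return the observed peak."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(lambda _: handle_request(), range(n_requests)))
    return gauge.peak


def suggested_max(peak: int, safety_pct: int = 10) -> int:
    """Peak concurrent use plus a safety factor, rounded up."""
    return (peak * (100 + safety_pct) + 99) // 100
```

In production the gauge would be read periodically or exported to whatever monitoring is already in place, rather than driven by a synthetic load.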

In addition, this should be done in the production environment for tuning. It's okay to get an estimate beforehand, but you never know what production will throw your way (which is why all these things should be configurable at runtime). This is to catch a situation such as an unexpected doubling of the client calls coming in.


#6

One thing to consider is how many cores exist on the machine that will be executing the code. That represents a hard limit on how many threads can be making progress at any given time. However, if, as in your case, threads are expected to be frequently waiting for a database to execute a query, you will probably want to tune your threads based on how many concurrent queries the database can process.
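
One way to tie thread activity to what the database can handle is a semaphore around the query, so the pool can be larger than the number of queries in flight. This is a sketch under assumptions: DbGate is an invented class, the limit of 3 is arbitrary, and the sleep stands in for a real query.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class DbGate:
    """Caps the number of simultaneous (simulated) queries and records the peak."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak = 0

    def query(self, request_id: int) -> int:
        with self._slots:  # blocks while the DB is at its limit
            with self._lock:
                self._in_flight += 1
                self.peak = max(self.peak, self._in_flight)
            time.sleep(0.02)  # stand-in for the real database round trip
            with self._lock:
                self._in_flight -= 1
        return request_id


def run_requests(gate: DbGate, n_requests: int, n_threads: int):
    """Serve n_requests on n_threads, all funnelled through the gate."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(gate.query, range(n_requests)))


gate = DbGate(max_concurrent=3)  # assumed DB limit, for illustration
served = run_requests(gate, n_requests=12, n_threads=8)
```

With eight threads and a limit of three, at most three queries run at once; the remaining threads wait, which suggests that threads beyond the database's own limit mostly add queueing rather than throughput.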

Reposted from: http://dzjnb.baihongyu.com/
