Message ID | 20230803075638.2355913-1-Qi.Chen@windriver.com
---|---
State | New
Series | [bitbake-devel,V4] runqueue.py: fix PSI check logic
On 2023-08-03 03:56, Qi.Chen@windriver.com wrote:
> From: Chen Qi <Qi.Chen@windriver.com>
>
> The current logic is not correct because if the time interval
> between the current check and the last check is very small, the PSI
> checker is not likely to block things even if the system is heavily
> loaded.

Agreed, there is a problem, but we do have to be careful not to make things too slow, since the goal is to have maximum throughput without too much congestion and CPU scheduling and I/O latency.

> In fact, if the time interval is very small, the result
> might be 0, resulting in any small value not having any effect.

Yes, the PSI variables are implemented as per-CPU vars, and you can get the same result several times in a row, even when there is lots of CPU pressure. E.g.:

$ stress -c 25 -t 100 &
[1] 3851144
stress: info: [3851144] dispatching hogs: 25 cpu, 0 io, 0 vm, 0 hdd
$ for i in `seq 200`; do date +"%Y-%m-%d %H:%M:%S.%N" >> /tmp/cpu; cat /proc/pressure/cpu >> /tmp/cpu ; done
$ less /tmp/cpu

See (1) below. I need to look at the PSI averaging algorithm again!

> To reproduce the current problem, set BB_PRESSURE_MAX_CPU/IO/MEMORY
> to "1", and start a build. We can see there's more than one task
> running throughout the whole build process.
>
> This patch improves things a little bit. It will block task spawning
> for at least 1s if the limits are exceeded; this is to overcome the
> 0-result issue. On the other hand, if task spawning is not
> blocked, we check the pressure every time.
>
> However, even with this fix, we still have the task spawning peak issue,
> because at some point, when the load is low enough, a few tasks will
> be spawned all at once.
>
> In summary, this patch improves the situation, but the peak task
> spawning issue is still there.
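To make the zero-delta failure mode concrete, here is a small standalone simulation of the delta-based check the patch modifies (illustrative only, not bitbake code; the function name is mine): when two reads of the cumulative "total=" stall counter land microseconds apart, the delta is 0 and no threshold, however small, can trip.

```python
# Standalone illustration (not bitbake code) of a delta-based PSI check.
# /proc/pressure/cpu exposes a cumulative "total=" stall counter in
# microseconds; the check compares (curr - prev) / tdiff to a limit.

def exceeds_pressure(prev_total, curr_total, tdiff, max_pressure):
    """Return True if the stall rate since the last sample exceeds the limit."""
    return (curr_total - prev_total) / tdiff > max_pressure

# Back-to-back reads: the counter has not been updated yet, so the delta
# is 0 and even a limit as low as 1 never blocks task spawning.
assert exceeds_pressure(38696165984, 38696165984, 0.004, 1) is False

# A second later the counter has moved, and the same limit does trip.
assert exceeds_pressure(38696165984, 38696265984, 1.0, 1) is True
```

This is why the patch refuses to re-evaluate for 1 s once blocking has started: it forces tdiff to be large enough for the counter delta to be meaningful.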
Richard and I were talking about this, and he suggested we take a different approach: change from a binary decision to one where, if we're close (say within 10%) of the threshold, we tell the bitbake scheduler: "Things don't look good, can you ask me again in a little while?" Then, if the congestion abates, we allow new tasks, and if not, we've avoided introducing new load as we ramp up. Richard claims that this "ask again in a while" is something the scheduler already has code for.

This is essentially a smoothing algorithm, with which I have little experience. I did, as a student decades ago, try to control the temperature of a beaker of water and bring it from room temperature to a predefined goal with minimal overshoot and ringing! There are people with expertise in this sort of thing, so if you are such a person, it would be good to have you appear out of the Yocto woodwork! I think we're doing something similar to the temperature-controlled beaker, except it's not a fixed thermal load but a continuously changing one.

I'm not sure if I've written this down before, but I think there would be some benefit, at least for a self-test, in a series of stress-ng tasks that we put into a recipe. One would bring the system up to just below CPU (or I/O) congestion and keep the system in that state for ~5-10 minutes. Then we'd run another N (= 10?) short-lived (30 s?) stress-ng -c tests all at once. If the pressure regulation were working well, the secondary stress-ng jobs would run serially and take N*Timeout to be done. Of course, on a loaded system such as the YP autobuilders this would take even longer, but it should never take less than N*Timeout seconds. While this sounds like a good test to me (and kind of fun!), we should remind ourselves that the larger task is to get make/ninja/cargo/etc to use a job pool per bitbake instance and/or patch them to read PSI info.
We'd still have the congestion onset/abatement problem with those tools, though, so the problem is worth figuring out. For now, I'd need to see the before and after times and buildstats for a build of, say, core-image-minimal and/or poky world on a system that runs at 100% capacity for a good part of the build without PSI regulation. Data below. Sorry, I don't have time to analyze or graph it.

../Randy

> Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
> ---
>  bitbake/lib/bb/runqueue.py | 33 ++++++++++++++++++---------------
>  1 file changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
> index 48788f4aa6..7c482a55f3 100644
> --- a/bitbake/lib/bb/runqueue.py
> +++ b/bitbake/lib/bb/runqueue.py
> @@ -179,6 +179,7 @@ class RunQueueScheduler(object):
>              self.prev_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
>              self.prev_pressure_time = time.time()
>              self.check_pressure = True
> +            self.psi_exceeded = False
>          except:
>              bb.note("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
>              self.check_pressure = False
> @@ -191,6 +192,14 @@ class RunQueueScheduler(object):
>          BB_PRESSURE_MAX_{CPU|IO|MEMORY} are set, return True if above threshold.
>          """
>          if self.check_pressure:
> +            now = time.time()
> +            tdiff = now - self.prev_pressure_time
> +            # If task spawning has been blocked, block it for at least 1s;
> +            # otherwise, check the pressure ASAP. This is because if tdiff is very small
> +            # the result of (curr_xxx_pressure - prev_xxx_pressure) might be 0, resulting
> +            # in any small value not having effect.
> + if self.psi_exceeded and tdiff < 1.0: > + return self.psi_exceeded > with open("/proc/pressure/cpu") as cpu_pressure_fds, \ > open("/proc/pressure/io") as io_pressure_fds, \ > open("/proc/pressure/memory") as memory_pressure_fds: > @@ -198,25 +207,19 @@ class RunQueueScheduler(object): > curr_cpu_pressure = cpu_pressure_fds.readline().split()[4].split("=")[1] > curr_io_pressure = io_pressure_fds.readline().split()[4].split("=")[1] > curr_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1] > - now = time.time() > - tdiff = now - self.prev_pressure_time > - if tdiff > 1.0: > - exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure > - exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure > - exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure > - self.prev_cpu_pressure = curr_cpu_pressure > - self.prev_io_pressure = curr_io_pressure > - self.prev_memory_pressure = curr_memory_pressure > - self.prev_pressure_time = now > - else: > - exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure > - exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure > - exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure > + exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure > + exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure > + 
exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure > + self.prev_cpu_pressure = curr_cpu_pressure > + self.prev_io_pressure = curr_io_pressure > + self.prev_memory_pressure = curr_memory_pressure > + self.prev_pressure_time = now > + self.psi_exceeded = exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure > pressure_state = (exceeds_cpu_pressure, exceeds_io_pressure, exceeds_memory_pressure) > if hasattr(self, "pressure_state") and pressure_state != self.pressure_state: > bb.note("Pressure status changed to CPU: %s, IO: %s, Mem: %s" % pressure_state) > self.pressure_state = pressure_state > - return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure) > + return self.psi_exceeded > return False > > def next_buildable_task(self): 1) $ rg "2023|some" /tmp/cpu 1:2023-08-03 21:29:06.026561191 2:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 4:2023-08-03 21:29:06.031553426 5:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 7:2023-08-03 21:29:06.035901497 8:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 10:2023-08-03 21:29:06.040136284 11:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 13:2023-08-03 21:29:06.043794493 14:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 16:2023-08-03 21:29:06.047555871 17:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 19:2023-08-03 21:29:06.051353279 20:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 22:2023-08-03 21:29:06.055225669 23:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 25:2023-08-03 21:29:06.059436263 26:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 28:2023-08-03 21:29:06.063509822 29:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 31:2023-08-03 21:29:06.067917089 32:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 34:2023-08-03 21:29:06.071907717 35:some avg10=0.00 avg60=0.00 
avg300=0.00 total=38696165984 37:2023-08-03 21:29:06.075865773 38:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 40:2023-08-03 21:29:06.079846668 41:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 43:2023-08-03 21:29:06.083961331 44:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 46:2023-08-03 21:29:06.087959027 47:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 49:2023-08-03 21:29:06.091544025 50:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 52:2023-08-03 21:29:06.095441172 53:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 55:2023-08-03 21:29:06.099397076 56:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 58:2023-08-03 21:29:06.103036659 59:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 61:2023-08-03 21:29:06.106987290 62:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 64:2023-08-03 21:29:06.110500839 65:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 67:2023-08-03 21:29:06.113993363 68:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 70:2023-08-03 21:29:06.117569797 71:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 73:2023-08-03 21:29:06.121977963 74:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 76:2023-08-03 21:29:06.125929934 77:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 79:2023-08-03 21:29:06.130064807 80:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 82:2023-08-03 21:29:06.134026786 83:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 85:2023-08-03 21:29:06.137735375 86:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 88:2023-08-03 21:29:06.141781398 89:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 91:2023-08-03 21:29:06.145287007 92:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 94:2023-08-03 21:29:06.149168307 95:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 97:2023-08-03 21:29:06.152818060 98:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 100:2023-08-03 
21:29:06.156414605 101:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 103:2023-08-03 21:29:06.160560201 104:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 106:2023-08-03 21:29:06.164730602 107:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 109:2023-08-03 21:29:06.168689146 110:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 112:2023-08-03 21:29:06.172659034 113:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 115:2023-08-03 21:29:06.176217293 116:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 118:2023-08-03 21:29:06.179786393 119:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 121:2023-08-03 21:29:06.183386966 122:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 124:2023-08-03 21:29:06.186901117 125:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 127:2023-08-03 21:29:06.191092359 128:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 130:2023-08-03 21:29:06.194689321 131:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 133:2023-08-03 21:29:06.198680598 134:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 136:2023-08-03 21:29:06.202466314 137:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 139:2023-08-03 21:29:06.206159217 140:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 142:2023-08-03 21:29:06.210732253 143:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 145:2023-08-03 21:29:06.214495861 146:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 148:2023-08-03 21:29:06.218378986 149:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 151:2023-08-03 21:29:06.222715102 152:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 154:2023-08-03 21:29:06.226811631 155:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 157:2023-08-03 21:29:06.231578359 158:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 160:2023-08-03 21:29:06.235715687 161:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 163:2023-08-03 
21:29:06.239569685 164:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 166:2023-08-03 21:29:06.243630675 167:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 169:2023-08-03 21:29:06.247567397 170:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 172:2023-08-03 21:29:06.251397908 173:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 175:2023-08-03 21:29:06.255011706 176:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 178:2023-08-03 21:29:06.258932029 179:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 181:2023-08-03 21:29:06.262697913 182:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 184:2023-08-03 21:29:06.266799173 185:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 187:2023-08-03 21:29:06.270925022 188:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 190:2023-08-03 21:29:06.275043244 191:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 193:2023-08-03 21:29:06.278632334 194:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 196:2023-08-03 21:29:06.282248451 197:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 199:2023-08-03 21:29:06.285993782 200:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 202:2023-08-03 21:29:06.290307906 203:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 205:2023-08-03 21:29:06.294324211 206:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 208:2023-08-03 21:29:06.298434915 209:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 211:2023-08-03 21:29:06.302588527 212:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 214:2023-08-03 21:29:06.306381068 215:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 217:2023-08-03 21:29:06.310380335 218:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 220:2023-08-03 21:29:06.315066360 221:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 223:2023-08-03 21:29:06.319399493 224:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 226:2023-08-03 
21:29:06.323533769 227:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 229:2023-08-03 21:29:06.327623574 230:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 232:2023-08-03 21:29:06.331404094 233:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 235:2023-08-03 21:29:06.335282298 236:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 238:2023-08-03 21:29:06.339218007 239:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 241:2023-08-03 21:29:06.343022226 242:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 244:2023-08-03 21:29:06.346892658 245:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 247:2023-08-03 21:29:06.350737213 248:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 250:2023-08-03 21:29:06.354365299 251:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 253:2023-08-03 21:29:06.358504205 254:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 256:2023-08-03 21:29:06.362474648 257:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 259:2023-08-03 21:29:06.366500614 260:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 262:2023-08-03 21:29:06.370409578 263:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 265:2023-08-03 21:29:06.374728115 266:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 268:2023-08-03 21:29:06.379177687 269:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 271:2023-08-03 21:29:06.382958096 272:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 274:2023-08-03 21:29:06.386287192 275:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 277:2023-08-03 21:29:06.390148390 278:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 280:2023-08-03 21:29:06.393995408 281:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 283:2023-08-03 21:29:06.397538386 284:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 286:2023-08-03 21:29:06.401522567 287:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 289:2023-08-03 
21:29:06.405142382 290:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 292:2023-08-03 21:29:06.408883272 293:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 295:2023-08-03 21:29:06.412675078 296:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 298:2023-08-03 21:29:06.416350916 299:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 301:2023-08-03 21:30:02.078140466 302:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696853256 304:2023-08-03 21:30:02.107326758 305:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696855634 307:2023-08-03 21:30:02.126028647 308:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696856111 310:2023-08-03 21:30:02.146028343 311:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696858681 313:2023-08-03 21:30:02.170021779 314:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696861693 316:2023-08-03 21:30:02.210054311 317:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696865034 319:2023-08-03 21:30:02.246006375 320:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696867390 322:2023-08-03 21:30:02.274025123 323:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696870090 325:2023-08-03 21:30:02.306023178 326:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696873121 328:2023-08-03 21:30:02.342042024 329:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696876150 331:2023-08-03 21:30:02.374045594 332:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696878182 334:2023-08-03 21:30:02.398024941 335:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696880542 337:2023-08-03 21:30:02.430025787 338:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696884229 340:2023-08-03 21:30:02.474025766 341:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696886618 343:2023-08-03 21:30:02.506036609 344:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696889650 346:2023-08-03 21:30:02.538005211 347:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696892350 349:2023-08-03 21:30:02.570002594 350:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696895386 352:2023-08-03 
21:30:02.602024574 353:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696897412 355:2023-08-03 21:30:02.626006000 356:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696899777 358:2023-08-03 21:30:02.654031808 359:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696901808 361:2023-08-03 21:30:02.686002144 362:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696904511 364:2023-08-03 21:30:02.718074747 365:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696908199 367:2023-08-03 21:30:02.754015511 368:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696910921 370:2023-08-03 21:30:02.786022272 371:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696913625 373:2023-08-03 21:30:02.818041167 374:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696916321 376:2023-08-03 21:30:02.858023387 377:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696919750 379:2023-08-03 21:30:02.894034643 380:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696922454 382:2023-08-03 21:30:02.926147057 383:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696925319 385:2023-08-03 21:30:02.954068342 386:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696927350 388:2023-08-03 21:30:02.986088254 389:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696930380 391:2023-08-03 21:30:03.018103409 392:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696933078 394:2023-08-03 21:30:03.050077392 395:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696935445 397:2023-08-03 21:30:03.074114392 398:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696938795 400:2023-08-03 21:30:03.114035084 401:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696940845 403:2023-08-03 21:30:03.142015391 404:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696943542 406:2023-08-03 21:30:03.174031239 407:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696946247 409:2023-08-03 21:30:03.210113551 410:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696949897 412:2023-08-03 21:30:03.242035265 413:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696952261 415:2023-08-03 
21:30:03.274046961 416:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696954940 418:2023-08-03 21:30:03.306028723 419:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696957656 421:2023-08-03 21:30:03.338077555 422:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696960686 424:2023-08-03 21:30:03.374073915 425:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696963050 427:2023-08-03 21:30:03.402047343 428:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696966086 430:2023-08-03 21:30:03.434032881 431:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696968788 433:2023-08-03 21:30:03.466032340 434:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696970813 436:2023-08-03 21:30:03.494068821 437:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696973176 439:2023-08-03 21:30:03.522161212 440:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696975872 442:2023-08-03 21:30:03.554050153 443:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696978574 445:2023-08-03 21:30:03.586065001 446:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696980938 448:2023-08-03 21:30:03.614119644 449:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696983300 451:2023-08-03 21:30:03.638144992 452:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696986016 454:2023-08-03 21:30:03.674118268 455:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696988717 457:2023-08-03 21:30:03.710118733 458:some avg10=3.99 avg60=1.11 avg300=0.25 total=38696991745 460:2023-08-03 21:30:03.738066832 461:some avg10=3.99 avg60=1.11 avg300=0.25 total=38696994115 463:2023-08-03 21:30:03.770048303 464:some avg10=3.99 avg60=1.11 avg300=0.25 total=38696996451 466:2023-08-03 21:30:03.802092382 467:some avg10=3.99 avg60=1.11 avg300=0.25 total=38696999169 469:2023-08-03 21:30:03.826077226 470:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697001198 472:2023-08-03 21:30:03.858062788 473:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697003903 475:2023-08-03 21:30:03.890062930 476:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697006941 478:2023-08-03 
21:30:03.922067512 479:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697009971 481:2023-08-03 21:30:03.958084669 482:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697013322 484:2023-08-03 21:30:03.994120043 485:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697015378 487:2023-08-03 21:30:04.018108794 488:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697017409 490:2023-08-03 21:30:04.042115598 491:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697019438 493:2023-08-03 21:30:04.066126937 494:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697021470 496:2023-08-03 21:30:04.090050277 497:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697023835 499:2023-08-03 21:30:04.122078760 500:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697026530 502:2023-08-03 21:30:04.158044418 503:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697029229 505:2023-08-03 21:30:04.186092879 506:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697032540 508:2023-08-03 21:30:04.233761652 509:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697035931 511:2023-08-03 21:30:04.262027484 512:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697038637 514:2023-08-03 21:30:04.294022638 515:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697041335 517:2023-08-03 21:30:04.322028009 518:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697044038 520:2023-08-03 21:30:04.354034481 521:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697046391 523:2023-08-03 21:30:04.386023486 524:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697048757 526:2023-08-03 21:30:04.414022848 527:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697051459 529:2023-08-03 21:30:04.442017832 530:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697053484 532:2023-08-03 21:30:04.466027364 533:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697056162 535:2023-08-03 21:30:04.502021088 536:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697058883 538:2023-08-03 21:30:04.534039153 539:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697061578 541:2023-08-03 
21:30:04.566039103 542:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697063939 544:2023-08-03 21:30:04.598037462 545:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697066641 547:2023-08-03 21:30:04.629771191 548:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697070011 550:2023-08-03 21:30:04.666042007 551:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697072371 553:2023-08-03 21:30:04.698010368 554:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697075069 556:2023-08-03 21:30:04.722019998 557:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697077770 559:2023-08-03 21:30:04.762030636 560:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697080474 562:2023-08-03 21:30:04.786010720 563:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697083181 565:2023-08-03 21:30:04.818035029 566:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697085877 568:2023-08-03 21:30:04.854034446 569:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697088575 571:2023-08-03 21:30:04.882043053 572:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697091289 574:2023-08-03 21:30:04.922081980 575:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697093316 577:2023-08-03 21:30:04.927564632 578:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697094670 580:2023-08-03 21:30:04.962026471 581:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697097373 583:2023-08-03 21:30:04.990040191 584:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697099739 586:2023-08-03 21:30:05.014033886 587:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697101764 589:2023-08-03 21:30:05.039750750 590:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697104463 592:2023-08-03 21:30:05.074019117 593:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697106610 595:2023-08-03 21:30:05.110046072 596:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697109855 598:2023-08-03 21:30:05.138046014 599:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697112555 601:2023-08-03 21:30:05.166023497 602:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697115232 604:2023-08-03 
21:30:05.206061564 605:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697118239 607:2023-08-03 21:30:05.238018646 608:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697120933 610:2023-08-03 21:30:05.270026372 611:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697123957 613:2023-08-03 21:30:05.302027288 614:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697126663 616:2023-08-03 21:30:05.334040250 617:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697129338 619:2023-08-03 21:30:05.360190632 620:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697131047 622:2023-08-03 21:30:05.390029162 623:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697133747 625:2023-08-03 21:30:05.414034794 626:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697135782 628:2023-08-03 21:30:05.427642722 629:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697137467 631:2023-08-03 21:30:05.458016115 632:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697139828 634:2023-08-03 21:30:05.486030455 635:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697142173 637:2023-08-03 21:30:05.514016836 638:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697144890 640:2023-08-03 21:30:05.546024372 641:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697147583 643:2023-08-03 21:30:05.584237211 644:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697150280 646:2023-08-03 21:30:05.614025918 647:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697152856 649:2023-08-03 21:30:05.646014419 650:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697155334 652:2023-08-03 21:30:05.670042958 653:some avg10=3.99 avg60=1.11 avg300=0.25 total=38697157696 655:2023-08-03 21:30:05.706027504 656:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697160413 658:2023-08-03 21:30:05.730024921 659:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697163116 661:2023-08-03 21:30:05.770026829 662:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697165892 664:2023-08-03 21:30:05.806011996 665:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697168589 667:2023-08-03 
21:30:05.830011568
668:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697171305
670:2023-08-03 21:30:05.862039144
671:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697173671
[~50 similar samples elided: reads every ~30 ms with total climbing steadily]
823:2023-08-03 21:30:07.446012112
824:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714
826:2023-08-03 21:30:07.449074039
827:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714
[~20 back-to-back reads elided: from 21:30:07.446 onward, every ~3 ms read
returns the identical total=38697306714 even though avg10 shows pressure]
898:2023-08-03 21:30:07.522058287
899:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714
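The frozen `total` counter in the readings above can be illustrated in miniature. The sketch below is standalone Python, not bitbake code (the function names are invented for illustration); it shows why a delta computed from two near-simultaneous PSI reads defeats any positive threshold:

```python
# /proc/pressure/cpu reports a cumulative stall time ("total", in
# microseconds). When two reads land a few milliseconds apart, the
# counter may not have advanced at all, so a delta-based check passes
# even though avg10 shows real pressure.

def parse_total(psi_line):
    # Field layout: "some avg10=... avg60=... avg300=... total=..."
    return int(psi_line.split()[4].split("=")[1])

def exceeds_pressure(prev_total, curr_total, tdiff, max_pressure):
    # Growth rate of stall time, compared against the configured limit.
    return (curr_total - prev_total) / tdiff > max_pressure

# Two identical samples ~3 ms apart, as seen repeatedly in the log above:
sample = "some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714"
blocked = exceeds_pressure(parse_total(sample), parse_total(sample), 0.003, 1)
# blocked is False: the zero delta defeats even BB_PRESSURE_MAX_CPU = "1"
```

This is the behaviour reported in the commit message: with a tiny check interval, the numerator is 0, so no threshold value, however small, can block task spawning.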
Hi Randy & Richard,

The key problem here is that *it takes time for tasks to occupy resources*, and during that time, new tasks will be spawned. This is the 'task spawning peak issue' I referred to in the commit message. Even if we switch to a non-binary approach, as long as there is some criterion, this peak issue will always be there, because there will always be a point where the load is 'low enough' (in other words, where 'things look good') and new tasks start to spawn.

I currently have no idea how we can solve this 'task spawning peak issue'. Any suggestions?

Regards,
Qi

On 8/6/23 01:29, Randy MacLeod wrote:
> On 2023-08-03 03:56, Qi.Chen@windriver.com wrote:
>> From: Chen Qi <Qi.Chen@windriver.com>
>>
>> The current logic is not correct because if the time interval
>> between the current check and the last check is very small, the PSI
>> checker is not likely to block things even if the system is heavily
>> loaded.
> Agreed, there is a problem, but we do have to be careful not to
> make things too slow, since the goal is to have maximum throughput
> without too much congestion and CPU-scheduling and IO latency.
>> In fact, if the time interval is very small, the result
>> might be 0, resulting in any small value not having any effect.
> Yes, the PSI variables are implemented as per-CPU vars and
> you can get the same result several times in a row, even when
> there is lots of CPU pressure. E.g.:
>
> $ stress -c 25 -t 100 &
> [1] 3851144
> stress: info: [3851144] dispatching hogs: 25 cpu, 0 io, 0 vm, 0 hdd
> $ for i in `seq 200`; do date +"%Y-%m-%d %H:%M:%S.%N" >> /tmp/cpu; cat /proc/pressure/cpu >> /tmp/cpu ; done
> $ less /tmp/cpu
>
> See (1) below.
>
> I need to look at the PSI averaging algorithm again!
>
>> To reproduce the current problem, set BB_PRESSURE_MAX_CPU/IO/MEMORY
>> to "1", and start a build. We can see there's more than one task
>> running throughout the whole build process.
>>
>> This patch improves things a little bit. It will block task spawning
>> for at least 1s if the limits are exceeded; this is to overcome the
>> 0-result issue. On the other hand, if task spawning is not blocked,
>> we check the pressure every time.
>>
>> However, even with this fix, we still have the task spawning peak
>> issue, because at some point, when the load is low enough, a few
>> tasks will be spawned all at once.
>>
>> In summary, this patch improves the situation, but the peak task
>> spawning issue is still there.
> Richard and I were talking about this and he suggested that we take
> a different approach: change from a binary approach to one where,
> if we're close (say within 10%) to the threshold, we tell the bitbake
> scheduler:
>
> "Things don't look good, can you ask me again in a little while?"
>
> Then if the congestion abates, we allow new tasks, and if not,
> we've avoided introducing new load as we are ramping up.
> Richard claims that this "ask again in a while" is something the
> scheduler already has code for.
>
> This is essentially a smoothing algorithm, with which I have little
> experience. I did, as a student decades ago, try to control the
> temperature of a beaker of water and bring it from room temperature
> to a predefined goal with minimal overshoot and reverberation!
> There are people with expertise in this sort of thing, so if you are
> such a person, it would be good to have you appear out of the Yocto
> woodwork!
>
> I think we're doing something similar to the temperature-controlled
> beaker, except it's not a fixed thermal load but rather a continuously
> changing one.
>
> I'm not sure if I've written this down before, but I think there would
> be some benefit, at least for a self-test, to have a series of stress-ng
> tasks that we put into a recipe. One would bring the system up to
> just below CPU (IO) congestion and keep the system in that state for
> ~5-10 minutes.
> Then we'd run another N (= 10?) short-lived (30 s?) stress-ng -c
> tests all at once. If the pressure regulation were working well,
> the secondary stress-ng jobs would run serially and take N*Timeout
> to be done.
>
> Of course, on a loaded system such as the YP autobuilders, this would
> take even longer, but it should never take < N*Timeout seconds.
>
> While this sounds like a good test to me (and kind of fun!), we should
> remind ourselves that the larger task is to get make/ninja/cargo/etc
> to use a job pool per bitbake instance and/or patch them to read PSI
> info.
>
> We'd still have the congestion onset/abatement problem with those
> tools, though, so the problem is worth figuring out.
>
> For now, I'd need to see the before and after times and buildstats for
> a build of, say, core-image-minimal and/or poky world on a system that
> runs at 100% capacity for a good part of the build without PSI
> regulation.
>
> Data below. Sorry, I don't have time to analyze or graph it.
>
> ../Randy
>
>> Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
>> ---
>>  bitbake/lib/bb/runqueue.py | 33 ++++++++++++++++++---------------
>>  1 file changed, 18 insertions(+), 15 deletions(-)
>>
>> diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
>> index 48788f4aa6..7c482a55f3 100644
>> --- a/bitbake/lib/bb/runqueue.py
>> +++ b/bitbake/lib/bb/runqueue.py
>> @@ -179,6 +179,7 @@ class RunQueueScheduler(object):
>>              self.prev_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
>>              self.prev_pressure_time = time.time()
>>              self.check_pressure = True
>> +            self.psi_exceeded = False
>>          except:
>>              bb.note("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
>>              self.check_pressure = False
>> @@ -191,6 +192,14 @@ class RunQueueScheduler(object):
>>          BB_PRESSURE_MAX_{CPU|IO|MEMORY} are set, return True if above threshold.
>>          """
>>          if self.check_pressure:
>> +            now = time.time()
>> +            tdiff = now - self.prev_pressure_time
>> +            # If task spawning has been blocked, block it for at least 1s;
>> +            # otherwise, check the pressure ASAP. This is because if tdiff is very small
>> +            # the result of (curr_xxx_pressure - prev_xxx_pressure) might be 0, resulting
>> +            # in any small value not having effect.
>> +            if self.psi_exceeded and tdiff < 1.0:
>> +                return self.psi_exceeded
>>              with open("/proc/pressure/cpu") as cpu_pressure_fds, \
>>                  open("/proc/pressure/io") as io_pressure_fds, \
>>                  open("/proc/pressure/memory") as memory_pressure_fds:
>> @@ -198,25 +207,19 @@ class RunQueueScheduler(object):
>>                  curr_cpu_pressure = cpu_pressure_fds.readline().split()[4].split("=")[1]
>>                  curr_io_pressure = io_pressure_fds.readline().split()[4].split("=")[1]
>>                  curr_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
>> -                now = time.time()
>> -                tdiff = now - self.prev_pressure_time
>> -                if tdiff > 1.0:
>> -                    exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure
>> -                    exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure
>> -                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure
>> -                    self.prev_cpu_pressure = curr_cpu_pressure
>> -                    self.prev_io_pressure = curr_io_pressure
>> -                    self.prev_memory_pressure = curr_memory_pressure
>> -                    self.prev_pressure_time = now
>> -                else:
>> -                    exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure
>> -                    exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure
>> -                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure
>> +                exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure
>> +                exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure
>> +                exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure
>> +                self.prev_cpu_pressure = curr_cpu_pressure
>> +                self.prev_io_pressure = curr_io_pressure
>> +                self.prev_memory_pressure = curr_memory_pressure
>> +                self.prev_pressure_time = now
>> +                self.psi_exceeded = exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure
>>              pressure_state = (exceeds_cpu_pressure, exceeds_io_pressure, exceeds_memory_pressure)
>>              if hasattr(self, "pressure_state") and pressure_state != self.pressure_state:
>>                  bb.note("Pressure status changed to CPU: %s, IO: %s, Mem: %s" % pressure_state)
>>              self.pressure_state = pressure_state
>> -            return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure)
>> +            return self.psi_exceeded
>>          return False
>>
>>      def next_buildable_task(self):
>
> 1)
>
> $ rg "2023|some" /tmp/cpu
> 1:2023-08-03 21:29:06.026561191
> 2:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984
> 4:2023-08-03 21:29:06.031553426
> 5:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984
> [~95 similar samples elided: every ~4 ms read up to 21:29:06.416350916
> returns the identical total=38696165984 on the idle system]
> 298:2023-08-03 21:29:06.416350916
> 299:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984
> 301:2023-08-03 21:30:02.078140466
> 302:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696853256
> 304:2023-08-03 21:30:02.107326758
> 305:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696855634
> [~125 similar samples elided: under the stress load, avg10 steps from
> 3.10 through 3.99 to 4.71 while total grows by roughly 2000-3000 us
> per ~30 ms read]
> 685:2023-08-03 21:30:06.026022629
> 686:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697187165
> [remainder of the log truncated]
688:2023-08-03 21:30:06.050024626 > 689:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697189526 > 691:2023-08-03 21:30:06.078013808 > 692:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697192230 > 694:2023-08-03 21:30:06.114030646 > 695:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697194603 > 697:2023-08-03 21:30:06.142038629 > 698:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697197296 > 700:2023-08-03 21:30:06.170044171 > 701:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697199658 > 703:2023-08-03 21:30:06.205670334 > 704:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697202692 > 706:2023-08-03 21:30:06.238037490 > 707:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697206060 > 709:2023-08-03 21:30:06.270024332 > 710:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697208087 > 712:2023-08-03 21:30:06.301784118 > 713:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697211461 > 715:2023-08-03 21:30:06.342039416 > 716:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697214160 > 718:2023-08-03 21:30:06.374013677 > 719:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697216861 > 721:2023-08-03 21:30:06.406040803 > 722:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697219559 > 724:2023-08-03 21:30:06.438006057 > 725:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697222256 > 727:2023-08-03 21:30:06.470026240 > 728:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697225629 > 730:2023-08-03 21:30:06.502024527 > 731:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697228305 > 733:2023-08-03 21:30:06.538033228 > 734:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697230686 > 736:2023-08-03 21:30:06.566028157 > 737:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697233689 > 739:2023-08-03 21:30:06.602025064 > 740:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697236080 > 742:2023-08-03 21:30:06.630027618 > 743:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697238780 > 745:2023-08-03 21:30:06.662010496 > 746:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697241478 > 748:2023-08-03 
21:30:06.690014164 > 749:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697243510 > 751:2023-08-03 21:30:06.714031649 > 752:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697246207 > 754:2023-08-03 21:30:06.750026276 > 755:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697248907 > 757:2023-08-03 21:30:06.786014937 > 758:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697251609 > 760:2023-08-03 21:30:06.818017495 > 761:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697254643 > 763:2023-08-03 21:30:06.846013970 > 764:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697256667 > 766:2023-08-03 21:30:06.878168821 > 767:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697259367 > 769:2023-08-03 21:30:06.902022147 > 770:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697262083 > 772:2023-08-03 21:30:06.937999183 > 773:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697264777 > 775:2023-08-03 21:30:06.970022469 > 776:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697267479 > 778:2023-08-03 21:30:07.006018171 > 779:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697270171 > 781:2023-08-03 21:30:07.030035755 > 782:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697272873 > 784:2023-08-03 21:30:07.070024602 > 785:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697275913 > 787:2023-08-03 21:30:07.098024161 > 788:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697277932 > 790:2023-08-03 21:30:07.126038057 > 791:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697280293 > 793:2023-08-03 21:30:07.150014689 > 794:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697282324 > 796:2023-08-03 21:30:07.174011609 > 797:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697284688 > 799:2023-08-03 21:30:07.206006119 > 800:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697287664 > 802:2023-08-03 21:30:07.242013736 > 803:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697290389 > 805:2023-08-03 21:30:07.273763791 > 806:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697293757 > 808:2023-08-03 21:30:07.306027619 
> 809:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697296435 > 811:2023-08-03 21:30:07.338033801 > 812:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697299155 > 814:2023-08-03 21:30:07.370014272 > 815:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697301335 > 817:2023-08-03 21:30:07.398049887 > 818:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697303375 > 820:2023-08-03 21:30:07.426030463 > 821:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306238 > 823:2023-08-03 21:30:07.446012112 > 824:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 826:2023-08-03 21:30:07.449074039 > 827:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 829:2023-08-03 21:30:07.452094902 > 830:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 832:2023-08-03 21:30:07.455132499 > 833:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 835:2023-08-03 21:30:07.458158709 > 836:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 838:2023-08-03 21:30:07.461160555 > 839:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 841:2023-08-03 21:30:07.464164844 > 842:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 844:2023-08-03 21:30:07.467191143 > 845:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 847:2023-08-03 21:30:07.470249391 > 848:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 850:2023-08-03 21:30:07.473268040 > 851:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 853:2023-08-03 21:30:07.476260894 > 854:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 856:2023-08-03 21:30:07.479273385 > 857:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 859:2023-08-03 21:30:07.482341558 > 860:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 862:2023-08-03 21:30:07.485402936 > 863:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 865:2023-08-03 21:30:07.488621116 > 866:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 868:2023-08-03 21:30:07.491648106 > 869:some 
avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 871:2023-08-03 21:30:07.494712160 > 872:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 874:2023-08-03 21:30:07.497741518 > 875:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 877:2023-08-03 21:30:07.500731815 > 878:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 880:2023-08-03 21:30:07.503757142 > 881:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 883:2023-08-03 21:30:07.506830439 > 884:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 886:2023-08-03 21:30:07.509933769 > 887:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 889:2023-08-03 21:30:07.512924245 > 890:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 892:2023-08-03 21:30:07.515952487 > 893:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 895:2023-08-03 21:30:07.519039796 > 896:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > 898:2023-08-03 21:30:07.522058287 > 899:some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714 > > -- > # Randy MacLeod > # Wind River Linux
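The tail of the dump above shows the failure mode under discussion: for roughly 76 ms, every read of /proc/pressure/cpu returns the identical total=38697306714, so a stall rate computed over such a short interval comes out as exactly 0 no matter how loaded the system is. A minimal standalone sketch of that rate computation (hypothetical helper names, not the runqueue.py code):

```python
def parse_psi_some(line):
    """Parse a 'some avg10=... avg60=... avg300=... total=...' line from
    /proc/pressure/<resource>. total is cumulative stall time in microseconds."""
    fields = dict(f.split("=") for f in line.split()[1:])
    return float(fields["avg10"]), int(fields["total"])

def pressure_rate(prev_total, cur_total, tdiff):
    """Stall time accrued per second of wall clock over the sample window.
    Over a very short window the counter may not have advanced at all,
    so the rate degenerates to 0 regardless of the actual load."""
    if tdiff <= 0:
        return 0.0
    return (cur_total - prev_total) / tdiff

# Two reads ~3 ms apart from the dump above: identical totals -> rate 0.
a = parse_psi_some("some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714")
b = parse_psi_some("some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714")
print(pressure_rate(a[1], b[1], 0.003))  # -> 0.0
```

This is the same delta-over-tdiff calculation the patch below performs; the dump demonstrates why a minimum sampling interval is needed before the delta is meaningful.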
On 8/6/23 01:29, Randy MacLeod wrote:
> On 2023-08-03 03:56, Qi.Chen@windriver.com wrote:
>> From: Chen Qi <Qi.Chen@windriver.com>
>>
>> The current logic is not correct because if the time interval
>> between the current check and the last check is very small, the PSI
>> checker is not likely to block things even if the system is heavily
>> loaded.
> Agreed, there is a problem, but we do have to be careful not to
> make things too slow, since the goal is to have maximum throughput
> without too much congestion or CPU-scheduling and IO latency.

Hi Randy & Richard,

Is the concern about the 1s blocking window?

I'm thinking of calculating the blocking time from the current pressure
data and the max setting. The idea is
(cur_pressure - max_value) / (cur_pressure + max_value). This keeps the
blocking time in the range (0, 1.0) seconds: the further cur_pressure
exceeds the max setting, the longer task spawning stays blocked.

What do you think?

Regards,
Qi

>> In fact, if the time interval is very small, the result
>> might be 0, resulting in any small value not having any effect.
> Yes, the PSI variables are implemented as per-CPU vars and
> you can get the same result several times in a row, even when
> there is lots of CPU pressure. E.g.:
>
> $ stress -c 25 -t 100 &
> [1] 3851144
> stress: info: [3851144] dispatching hogs: 25 cpu, 0 io, 0 vm, 0 hdd
> $ for i in `seq 200`; do date +"%Y-%m-%d %H:%M:%S.%N" >> /tmp/cpu; cat /proc/pressure/cpu >> /tmp/cpu ; done
> $ less /tmp/cpu
>
> See (1) below.
>
> I need to look at the PSI averaging algorithm again!
>
>> To reproduce the current problem, set BB_PRESSURE_MAX_CPU/IO/MEMORY
>> to "1", and start a build. We can see there's more than one task
>> running throughout the whole build process.
>>
>> This patch improves things a little bit. It will block task spawning
>> for at least 1s if the limits are exceeded; this is to overcome the
>> 0 result issue.
>> On the other hand, if the task spawning is not
>> blocked, we check the pressure every time.
>>
>> However, even with this fix, we still have the task spawning peak issue,
>> because at some point, when the load is low enough, a few tasks will
>> be spawned all at once.
>>
>> In summary, this patch improves the situation, but the peak task
>> spawning issue is still there.
> Richard and I were talking about this, and he suggested we take a
> different approach: change from a binary decision to one where, if
> we're close (say within 10%) to the threshold, we tell the bitbake
> scheduler:
>
> "Things don't look good, can you ask me again in a little while?"
>
> Then if the congestion abates, we allow new tasks, and if not,
> we've avoided introducing new load while we are ramping up.
> Richard claims that this "ask again in a while" is something the
> scheduler already has code for.
>
> This is essentially a smoothing algorithm, with which I have little
> experience. I did, as a student decades ago, try to control the
> temperature of a beaker of water and bring it from room temperature
> to a predefined goal with minimal overshoot and ringing! There are
> people with expertise in this sort of thing, so if you are such a
> person, it would be good to have you appear out of the Yocto woodwork!
>
> I think we're doing something similar to the temperature-controlled
> beaker, except it's not a fixed thermal load but a continuously
> changing one.
>
> I'm not sure if I've written this down before, but I think there would
> be some benefit, at least for a self-test, in a series of stress-ng
> tasks that we put into a recipe. One would bring the system up to
> just below CPU (IO) congestion and keep it in that state for
> ~5-10 minutes. Then we'd run another N (= 10?) short-lived (30 s?)
> stress-ng -c tests all at once.
> If the pressure regulation were working well,
> the secondary stress-ng jobs would run serially and take N*Timeout
> to finish.
>
> Of course, on a loaded system such as the YP autobuilders, this would
> take even longer, but it should never take less than N*Timeout seconds.
>
> While this sounds like a good test to me (and kind of fun!), we should
> remind ourselves that the larger task is to get make/ninja/cargo/etc.
> to use a job pool per bitbake instance and/or patch them to read PSI
> info.
>
> We'd still have the congestion onset/abatement problem with those
> tools, though, so the problem is worth figuring out.
>
> For now, I'd need to see the before and after times and buildstats for
> a build of, say, core-image-minimal and/or poky world on a system that
> runs at 100% capacity for a good part of the build without PSI
> regulation.
>
> Data below. Sorry, I don't have time to analyze or graph it.
>
> ../Randy
>
>> Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
>> ---
>>  bitbake/lib/bb/runqueue.py | 33 ++++++++++++++++++---------------
>>  1 file changed, 18 insertions(+), 15 deletions(-)
>>
>> diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
>> index 48788f4aa6..7c482a55f3 100644
>> --- a/bitbake/lib/bb/runqueue.py
>> +++ b/bitbake/lib/bb/runqueue.py
>> @@ -179,6 +179,7 @@ class RunQueueScheduler(object):
>>                  self.prev_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
>>                  self.prev_pressure_time = time.time()
>>                  self.check_pressure = True
>> +                self.psi_exceeded = False
>>              except:
>>                  bb.note("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
>>                  self.check_pressure = False
>> @@ -191,6 +192,14 @@ class RunQueueScheduler(object):
>>          BB_PRESSURE_MAX_{CPU|IO|MEMORY} are set, return True if above threshold.
>> """ >> if self.check_pressure: >> + now = time.time() >> + tdiff = now - self.prev_pressure_time >> + # If task spawning has been blocked, block it for at least 1s; >> + # otherwise, check the pressure ASAP. This is because if tdiff is very small >> + # the result of (curr_xxx_pressure - prev_xxx_pressure) might be 0, resulting >> + # in any small value not having effect. >> + if self.psi_exceeded and tdiff < 1.0: >> + return self.psi_exceeded >> with open("/proc/pressure/cpu") as cpu_pressure_fds, \ >> open("/proc/pressure/io") as io_pressure_fds, \ >> open("/proc/pressure/memory") as memory_pressure_fds: >> @@ -198,25 +207,19 @@ class RunQueueScheduler(object): >> curr_cpu_pressure = cpu_pressure_fds.readline().split()[4].split("=")[1] >> curr_io_pressure = io_pressure_fds.readline().split()[4].split("=")[1] >> curr_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1] >> - now = time.time() >> - tdiff = now - self.prev_pressure_time >> - if tdiff > 1.0: >> - exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure >> - exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure >> - exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure >> - self.prev_cpu_pressure = curr_cpu_pressure >> - self.prev_io_pressure = curr_io_pressure >> - self.prev_memory_pressure = curr_memory_pressure >> - self.prev_pressure_time = now >> - else: >> - exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure >> - exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure >> - exceeds_memory_pressure = self.rq.max_memory_pressure and 
(float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure >> + exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure >> + exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure >> + exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure >> + self.prev_cpu_pressure = curr_cpu_pressure >> + self.prev_io_pressure = curr_io_pressure >> + self.prev_memory_pressure = curr_memory_pressure >> + self.prev_pressure_time = now >> + self.psi_exceeded = exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure >> pressure_state = (exceeds_cpu_pressure, exceeds_io_pressure, exceeds_memory_pressure) >> if hasattr(self, "pressure_state") and pressure_state != self.pressure_state: >> bb.note("Pressure status changed to CPU: %s, IO: %s, Mem: %s" % pressure_state) >> self.pressure_state = pressure_state >> - return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure) >> + return self.psi_exceeded >> return False >> >> def next_buildable_task(self): > > > 1) > > $ rg "2023|some" /tmp/cpu > 1:2023-08-03 21:29:06.026561191 > 2:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 4:2023-08-03 21:29:06.031553426 > 5:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 7:2023-08-03 21:29:06.035901497 > 8:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 10:2023-08-03 21:29:06.040136284 > 11:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 13:2023-08-03 21:29:06.043794493 > 14:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 16:2023-08-03 21:29:06.047555871 > 17:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 19:2023-08-03 21:29:06.051353279 > 20:some avg10=0.00 avg60=0.00 avg300=0.00 
total=38696165984 > 22:2023-08-03 21:29:06.055225669 > 23:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 25:2023-08-03 21:29:06.059436263 > 26:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 28:2023-08-03 21:29:06.063509822 > 29:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 31:2023-08-03 21:29:06.067917089 > 32:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 34:2023-08-03 21:29:06.071907717 > 35:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 37:2023-08-03 21:29:06.075865773 > 38:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 40:2023-08-03 21:29:06.079846668 > 41:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 43:2023-08-03 21:29:06.083961331 > 44:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 46:2023-08-03 21:29:06.087959027 > 47:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 49:2023-08-03 21:29:06.091544025 > 50:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 52:2023-08-03 21:29:06.095441172 > 53:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 55:2023-08-03 21:29:06.099397076 > 56:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 58:2023-08-03 21:29:06.103036659 > 59:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 61:2023-08-03 21:29:06.106987290 > 62:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 64:2023-08-03 21:29:06.110500839 > 65:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 67:2023-08-03 21:29:06.113993363 > 68:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 70:2023-08-03 21:29:06.117569797 > 71:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 73:2023-08-03 21:29:06.121977963 > 74:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 76:2023-08-03 21:29:06.125929934 > 77:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 79:2023-08-03 21:29:06.130064807 > 80:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 82:2023-08-03 21:29:06.134026786 > 
83:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 85:2023-08-03 21:29:06.137735375 > 86:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 88:2023-08-03 21:29:06.141781398 > 89:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 91:2023-08-03 21:29:06.145287007 > 92:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 94:2023-08-03 21:29:06.149168307 > 95:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 97:2023-08-03 21:29:06.152818060 > 98:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 100:2023-08-03 21:29:06.156414605 > 101:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 103:2023-08-03 21:29:06.160560201 > 104:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 106:2023-08-03 21:29:06.164730602 > 107:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 109:2023-08-03 21:29:06.168689146 > 110:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 112:2023-08-03 21:29:06.172659034 > 113:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 115:2023-08-03 21:29:06.176217293 > 116:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 118:2023-08-03 21:29:06.179786393 > 119:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 121:2023-08-03 21:29:06.183386966 > 122:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 124:2023-08-03 21:29:06.186901117 > 125:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 127:2023-08-03 21:29:06.191092359 > 128:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 130:2023-08-03 21:29:06.194689321 > 131:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 133:2023-08-03 21:29:06.198680598 > 134:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 136:2023-08-03 21:29:06.202466314 > 137:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 139:2023-08-03 21:29:06.206159217 > 140:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 142:2023-08-03 21:29:06.210732253 > 143:some avg10=0.00 avg60=0.00 
avg300=0.00 total=38696165984 > 145:2023-08-03 21:29:06.214495861 > 146:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 148:2023-08-03 21:29:06.218378986 > 149:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 151:2023-08-03 21:29:06.222715102 > 152:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 154:2023-08-03 21:29:06.226811631 > 155:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 157:2023-08-03 21:29:06.231578359 > 158:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 160:2023-08-03 21:29:06.235715687 > 161:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 163:2023-08-03 21:29:06.239569685 > 164:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 166:2023-08-03 21:29:06.243630675 > 167:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 169:2023-08-03 21:29:06.247567397 > 170:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 172:2023-08-03 21:29:06.251397908 > 173:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 175:2023-08-03 21:29:06.255011706 > 176:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 178:2023-08-03 21:29:06.258932029 > 179:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 181:2023-08-03 21:29:06.262697913 > 182:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 184:2023-08-03 21:29:06.266799173 > 185:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 187:2023-08-03 21:29:06.270925022 > 188:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 190:2023-08-03 21:29:06.275043244 > 191:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 193:2023-08-03 21:29:06.278632334 > 194:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 196:2023-08-03 21:29:06.282248451 > 197:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 199:2023-08-03 21:29:06.285993782 > 200:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 202:2023-08-03 21:29:06.290307906 > 203:some avg10=0.00 avg60=0.00 avg300=0.00 
total=38696165984 > 205:2023-08-03 21:29:06.294324211 > 206:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 208:2023-08-03 21:29:06.298434915 > 209:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 211:2023-08-03 21:29:06.302588527 > 212:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 214:2023-08-03 21:29:06.306381068 > 215:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 217:2023-08-03 21:29:06.310380335 > 218:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 220:2023-08-03 21:29:06.315066360 > 221:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 223:2023-08-03 21:29:06.319399493 > 224:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 226:2023-08-03 21:29:06.323533769 > 227:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 229:2023-08-03 21:29:06.327623574 > 230:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 232:2023-08-03 21:29:06.331404094 > 233:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 235:2023-08-03 21:29:06.335282298 > 236:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 238:2023-08-03 21:29:06.339218007 > 239:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 241:2023-08-03 21:29:06.343022226 > 242:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 244:2023-08-03 21:29:06.346892658 > 245:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 247:2023-08-03 21:29:06.350737213 > 248:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 250:2023-08-03 21:29:06.354365299 > 251:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 253:2023-08-03 21:29:06.358504205 > 254:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 256:2023-08-03 21:29:06.362474648 > 257:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 259:2023-08-03 21:29:06.366500614 > 260:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 262:2023-08-03 21:29:06.370409578 > 263:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 
265:2023-08-03 21:29:06.374728115 > 266:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 268:2023-08-03 21:29:06.379177687 > 269:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 271:2023-08-03 21:29:06.382958096 > 272:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 274:2023-08-03 21:29:06.386287192 > 275:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 277:2023-08-03 21:29:06.390148390 > 278:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 280:2023-08-03 21:29:06.393995408 > 281:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 283:2023-08-03 21:29:06.397538386 > 284:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 286:2023-08-03 21:29:06.401522567 > 287:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 289:2023-08-03 21:29:06.405142382 > 290:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 292:2023-08-03 21:29:06.408883272 > 293:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 295:2023-08-03 21:29:06.412675078 > 296:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 298:2023-08-03 21:29:06.416350916 > 299:some avg10=0.00 avg60=0.00 avg300=0.00 total=38696165984 > 301:2023-08-03 21:30:02.078140466 > 302:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696853256 > 304:2023-08-03 21:30:02.107326758 > 305:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696855634 > 307:2023-08-03 21:30:02.126028647 > 308:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696856111 > 310:2023-08-03 21:30:02.146028343 > 311:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696858681 > 313:2023-08-03 21:30:02.170021779 > 314:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696861693 > 316:2023-08-03 21:30:02.210054311 > 317:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696865034 > 319:2023-08-03 21:30:02.246006375 > 320:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696867390 > 322:2023-08-03 21:30:02.274025123 > 323:some avg10=3.10 avg60=0.88 avg300=0.20 total=38696870090 > 325:2023-08-03 
2023-08-03 21:30:02.306023178
some avg10=3.10 avg60=0.88 avg300=0.20 total=38696873121
2023-08-03 21:30:02.342042024
some avg10=3.10 avg60=0.88 avg300=0.20 total=38696876150
2023-08-03 21:30:02.374045594
some avg10=3.10 avg60=0.88 avg300=0.20 total=38696878182
[...]
2023-08-03 21:30:03.710118733
some avg10=3.99 avg60=1.11 avg300=0.25 total=38696991745
[...]
2023-08-03 21:30:05.706027504
some avg10=4.71 avg60=1.34 avg300=0.30 total=38697160413
[...]
2023-08-03 21:30:07.446012112
some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714
2023-08-03 21:30:07.449074039
some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714
[...]
2023-08-03 21:30:07.522058287
some avg10=4.71 avg60=1.34 avg300=0.30 total=38697306714

[~160 similar samples trimmed: avg10 steps from 3.10 to 3.99 to 4.71 while total grows on every ~30 ms read, until 21:30:07.446, after which ~25 back-to-back reads a few milliseconds apart all return the identical total=38697306714]

--
# Randy MacLeod
# Wind River Linux
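For anyone post-processing dumps like the one above, the fields of a PSI "some" line can be pulled apart with a few lines of Python. This is a throwaway helper for analysis, not anything from bitbake:

```python
def parse_psi_some(line):
    """Parse a 'some avg10=... avg60=... avg300=... total=...' PSI line.

    Returns (avg10, avg60, avg300, total): the averages as floats
    (percent) and total as an integer (microseconds stalled since boot).
    """
    fields = dict(f.split("=") for f in line.split()[1:])
    return (float(fields["avg10"]), float(fields["avg60"]),
            float(fields["avg300"]), int(fields["total"]))

# Example using a sample from the dump above:
avg10, avg60, avg300, total = parse_psi_some(
    "some avg10=3.10 avg60=0.88 avg300=0.20 total=38696873121")
```

Diffing consecutive `total` values from such parsed samples is what exposes the runs of identical readings seen at the end of the dump.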
On Mon, 2023-08-07 at 17:06 +0800, ChenQi wrote:
> On 8/6/23 01:29, Randy MacLeod wrote:
> > On 2023-08-03 03:56, Qi.Chen@windriver.com wrote:
> > > From: Chen Qi <Qi.Chen@windriver.com>
> > >
> > > The current logic is not correct because if the time interval
> > > between the current check and the last check is very small, the PSI
> > > checker is not likely to block things even if the system is heavily
> > > loaded.
> > Agreed, there is a problem but we do have to be careful to not
> > make things too slow since the goal is to have maximum throughput
> > without having too much congestion and cpu scheduling and IO latency.
>
> Hi Randy & Richard,
>
> Is the concern about the 1s blocking window?
>
> I'm thinking about making the blocking time calculated from the current
> pressure data & max setting value.
>
> The idea is (cur_pressure - max_value) / (cur_pressure + max_value). In
> this way, the blocking time is in the range (0, 1.0); the more
> cur_pressure exceeds the max setting, the longer tasks are blocked.
>
> What do you think?

I was thinking of something a bit different. If cur_pressure is less
than, say, 70% of max, we execute tasks as normal. When pressure is
70-90% of max, we can implement something like a 0.5s blocking time, and
then from 90% to max we can use 1s.

The idea would be to only use the delay if we're close to the maximum
pressure value. We'd need some experiments to determine the right
percentages and blocking times.

How does that idea sound?

Cheers,

Richard
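Richard's tiered proposal maps pressure to a delay roughly as follows. This is only a sketch of the idea from the email, not the actual runqueue.py code; the 70%/90% cut-offs and 0.5 s/1 s delays are the placeholder values he suggests experimenting with:

```python
def blocking_delay(cur_pressure, max_pressure):
    """Map the current pressure reading to a task-spawn blocking
    delay in seconds, using Richard's suggested tiers.

    Below 70% of the limit, tasks run as normal; between 70% and
    90%, spawning is held off briefly; above 90%, it is held off
    for the full second.
    """
    ratio = cur_pressure / max_pressure
    if ratio < 0.7:
        return 0.0
    if ratio < 0.9:
        return 0.5
    return 1.0
```

The appeal of the tiers is that the delay only kicks in near the limit, so throughput is unaffected while the system is comfortably below the configured maximum.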
Hi Richard,

I implemented the logic you suggested below, but I found the 0 result
issue remains. I also checked the solution I suggested in the previous
email, but the 0 result issue also remains.

So I added more logs to check the PSI data behavior. (I really should
have done this before!) My setting is: BB_PRESSURE_MAX_IO = "1"

The result is a little surprising to me. See
https://github.com/ChenQi1989/poky/blob/io-pressure-logs/console-latest.log
and
https://github.com/ChenQi1989/poky/blob/io-pressure-logs/io_pressre_grep.log
which is derived from 'grep io_pressure console-latest.log'.

There are lines like:

WARNING: io_pressure is 50448.00396988443; tdiff = 1.0781397819519043
WARNING: io_pressure is 0.0; tdiff = 1.015068531036377
...
WARNING: io_pressure is 50448.00396988443; tdiff = 1.0781397819519043
WARNING: io_pressure is 0.0; tdiff = 1.015068531036377
...
WARNING: io_pressure is 124236.79015441194; tdiff = 0.9999372959136963
WARNING: io_pressure is 186346.268013355; tdiff = 1.0000200271606445
WARNING: io_pressure is 0.0; tdiff = 1.0000898838043213
WARNING: io_pressure is 0.0; tdiff = 0.07874345779418945
WARNING: io_pressure is 0.0; tdiff = 0.03340554237365723
WARNING: io_pressure is 0.0; tdiff = 0.04926180839538574
WARNING: io_pressure is 0.0; tdiff = 0.012816905975341797
WARNING: io_pressure is 54966.74998052512; tdiff = 0.11936306953430176

From these logs, I'm starting to wonder whether not using the
average10/60 PSI data was the right choice in the first place.
Best Regards,
Qi
Hi Richard & Randy,

I found that the current code's accumulation logic can (by accident)
largely mitigate the 0 result problem. The only thing that needs to be
fixed is the calculation. Here are some results with extra logging (the
logging code was removed before sending out the patch).

WARNING: io_pressure: 670460.2519375327; tdiff = 1.1002755165100098
WARNING: io_pressure: 993875.3919909863; tdiff = 0.10157108306884766
WARNING: io_pressure: 996208.4676606514; tdiff = 0.2030026912689209
WARNING: io_pressure: 993647.5471232203; tdiff = 0.30433225631713867
WARNING: io_pressure: 994384.11498443; tdiff = 0.38404273986816406
WARNING: io_pressure: 967088.1754989834; tdiff = 0.4916532039642334
WARNING: io_pressure: 971829.9324483782; tdiff = 0.5942356586456299
WARNING: io_pressure: 975313.2658167384; tdiff = 0.6967833042144775
WARNING: io_pressure: 970512.9149231451; tdiff = 0.7993566989898682
WARNING: io_pressure: 949640.4937089952; tdiff = 0.901923418045044
WARNING: io_pressure: 918329.1719976218; tdiff = 1.0045254230499268
WARNING: io_pressure: 312769.20515803626; tdiff = 0.10266995429992676
WARNING: io_pressure: 405440.13771108247; tdiff = 0.2054286003112793
WARNING: io_pressure: 432910.60682429565; tdiff = 0.3081629276275635
WARNING: io_pressure: 442150.1656405503; tdiff = 0.37966060638427734
WARNING: io_pressure: 456412.32497144747; tdiff = 0.487440824508667
WARNING: io_pressure: 459391.5767844752; tdiff = 0.5902829170227051
WARNING: io_pressure: 464544.29419806163; tdiff = 0.6930813789367676
WARNING: io_pressure: 459285.51696186996; tdiff = 0.7958099842071533
WARNING: io_pressure: 460694.17245974333; tdiff = 0.898526668548584
WARNING: io_pressure: 463365.750929515; tdiff = 1.00123929977417
WARNING: io_pressure: 499291.300533051; tdiff = 0.10269355773925781
WARNING: io_pressure: 498869.8346194204; tdiff = 0.20540428161621094

Regards,
Qi
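The accumulation behaviour described above can be sketched roughly as follows. This is illustrative Python with invented names, not the actual runqueue.py code: the point is that keeping the reference sample fixed until at least a full window has elapsed lets small deltas accumulate, so a rapid re-check divides a meaningful total delta by the elapsed time instead of returning a spurious 0:

```python
class PressureChecker:
    """Compute pressure as delta(total) / delta(t), but only advance
    the reference sample once at least min_window seconds have passed,
    so rapid back-to-back polls accumulate instead of yielding 0."""

    def __init__(self, min_window=1.0):
        self.min_window = min_window
        self.ref_total = None
        self.ref_time = None

    def pressure(self, total, now):
        if self.ref_total is None:
            self.ref_total, self.ref_time = total, now
            return 0.0
        tdiff = now - self.ref_time
        value = (total - self.ref_total) / tdiff if tdiff > 0 else 0.0
        # Hold the reference until the window has elapsed; this is the
        # accumulation that avoids the tiny-tdiff zero-delta problem.
        if tdiff >= self.min_window:
            self.ref_total, self.ref_time = total, now
        return value
```

With this shape, checks at 0.1 s intervals still see a growing numerator, which matches the smoothly accumulating tdiff values in the log excerpt above.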
Hi Qi,

On Tue, 2023-08-08 at 05:55 +0000, Chen Qi via lists.openembedded.org wrote:
> I implemented the logic you suggested below, but I found the 0 result
> issue remains.
> I also checked my solution suggested in the previous email, but the 0
> result issue also remains.
> So I added more logs to check the PSI data behavior. (I really should
> have done this before!)
> My setting is: BB_PRESSURE_MAX_IO = "1"
>
> The result is a little surprising to me.
> See https://github.com/ChenQi1989/poky/blob/io-pressure-logs/console-latest.log
> and https://github.com/ChenQi1989/poky/blob/io-pressure-logs/io_pressre_grep.log
> which is derived from 'grep io_pressure console-latest.log'.
>
> There are lines like:
> WARNING: io_pressure is 50448.00396988443; tdiff = 1.0781397819519043
> WARNING: io_pressure is 0.0; tdiff = 1.015068531036377
> [...]
>
> From these logs, I'm starting to wonder if not using the average10/60
> PSI data is a right choice in the first place.

I merged your v5 of the patch since that seemed to make sense to me but
I'm really not sure about the PSI data itself. That is something Randy
looked into in a lot more detail than I did...

Cheers,

Richard
On 2023-08-13 04:36, Richard Purdie wrote:
> Hi Qi,
>
> On Tue, 2023-08-08 at 05:55 +0000, Chen Qi via lists.openembedded.org
> wrote:
>> I implemented the logic you suggested below, but I found the 0 result
>> issue remains.
>> [...]
>> From these logs, I'm starting to wonder if not using the average10/60
>> PSI data is a right choice in the first place.

Certainly 60 seconds is too long of an averaging time, and we concluded
that 10 seconds was too long last year as well, but if we can come up
with a test that shows that it's a good compromise, I don't object to
that idea completely.
> I merged your v5 of the patch since that seemed to make sense to me but
> I'm really not sure about the PSI data itself. That is something Randy
> looked into in a lot more detail than I did...

I've been focusing on CPU pressure, and while I like your idea that we
can test the behaviour by setting the pressure limit (cpu, io) to 1 and
checking that we only see one task running at a time, I don't think
that's a reasonable expectation. It might work if we have a stress job
running at the same time that keeps the system above the pressure
threshold.

Even then I'm not yet sure how to deal with the actual data from the
PSI, especially when we are polling it quite frequently. I suspect we
should limit how often we read the PSI data. You have to remember that
the value read from /proc/pressure/* isn't updated on each read and is
the sum of N variables that indicate congestion in a given CPU.

I need to read the kernel code and lkml discussion again to understand
why we see such large variation in Total congestion!

Let's ask a couple of simple questions: What does the cpu/io congestion
data look like, and how quickly after launching a task does the kernel
reflect the new congestion state?

I was playing around on a 24 thread system by running enough cpu
burners to have the system close to but not yet experiencing
contention. I picked 20 cpu eaters. Then, after collecting PSI info for
a while, I'd start one more cpu eater every N seconds. I expected to
see a nice square staircase, but that's not what I saw and I'm not sure
why yet. I was going to continue playing with that until I had figured
it out, but maybe you want to take a look as well.

I just had two scripts (see below, but beware that this was playing
around, so the scripts may differ slightly from the graphs shown here):

1) collect-psi (should this run as a real-time thread and be written in
C to minimize impact?)
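Limiting how often the PSI files are read, as suggested above, could be prototyped along these lines. The `RateLimitedReader` name and the 0.1 s default are invented for the example; this is not bitbake code:

```python
import time


class RateLimitedReader:
    """Return a cached PSI sample unless min_interval has elapsed,
    avoiding back-to-back reads that just return the same 'total'."""

    def __init__(self, read_fn, min_interval=0.1):
        self.read_fn = read_fn          # e.g. reads /proc/pressure/cpu
        self.min_interval = min_interval
        self.last_time = None
        self.last_value = None

    def read(self, now=None):
        # 'now' can be injected for testing; real callers use the clock.
        now = time.monotonic() if now is None else now
        if self.last_time is None or now - self.last_time >= self.min_interval:
            self.last_value = self.read_fn()
            self.last_time = now
        return self.last_value
```

A wrapper like this would also make the polling cost predictable, independent of how often the scheduler asks whether it may spawn another task.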
2) scale-cpu Green points are the differences in some CPU Total values Red line is the avg10 % from some CPU. I think I scaled the avg10 % manually and tried to align them until it was close enough to compare visually. It would be interesting to filter the raw Total data using your updated algorithm. You can see that the "stairs" are not nearly as sharp as expected. Zooming in, you see that the data is quite noisy: My next step was to under stand the impact of reading the PSI data every 100 ms rather than every 10 ms and to only launch a new stress cpu eater every 60 seconds or so in hope that the PSI data would be more staircase-like as expected. Of course bitbake wants to launch tasks as quickly as possible so we do seem to have a filtering problem. Note that every after the ramp up and down of additional cpu eaters has completed, there are still times when (systemd and ??) cause additional cpu contention but for short periods: All for tonight. ../Randy ❯ cat collect-psi date +"%T.%6N"; for s in `seq 5000`; do for m in `seq 100`; do date +"%s.%6N"; #echo $s $m; cat /proc/pressure/cpu sleep 0.01; done; done date +"%T.%6N"; yp-ab/2023/stress-ramp ❯ cat scale-cpu date +"%T.%6N"; date +"%s.%6N"; sleep 100 date +"%s.%6N"; stress -c 20 -t 2000 & sleep 500 #sleep 2 for s in `seq 50`; do date +"%s.%6N"; stress -c 1 -t 500 & sleep 20 done date +"%T.%6N"; sleep 1000 date +"%T.%6N"; sleep 100 date +"%T.%6N"; date +"%s.%6N"; #sleep 50 I don't have time to explain how I filtered the data you'd get from those scripts tonight but if you look below, you can probably figure it out: # produce total data (take diff in xmgrace) $ rg "^1691|some" data/5000-sec-100-delay-100ms.log | paste -sd ' \n' | sed -e 's/some.*=//' | awk -e '{print $1-1691625397 " " $2-40598616909}' > 5000-time-total-100-delay-100ms.dat #produce avg data $ rg "^1691|some" data/5000-sec-100-delay-100ms.log | paste -sd ' \n' | sed -e 's/some avg10=//' -e 's/avg60.*//' | awk -e '{print $1-1691625397 " " $2}' > 
5000-time-avg-100-delay-100ms.dat

> Cheers,
>
> Richard
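As an aside for anyone wanting to reproduce this without xmgrace: the collect-psi log interleaves `date +"%s.%6N"` timestamps with /proc/pressure/cpu output, and the rg/sed/awk pipeline above reduces that to (time, Total) and (time, avg10) pairs. A rough, hypothetical Python equivalent (the helper names are mine, not bitbake's) might look like:

```python
import re

def parse_psi_some(line):
    """Split a 'some avg10=... avg60=... avg300=... total=...' line
    from /proc/pressure/* into a {field: value} dict."""
    fields = dict(f.split("=") for f in line.split()[1:])
    return {"avg10": float(fields["avg10"]),
            "avg60": float(fields["avg60"]),
            "avg300": float(fields["avg300"]),
            "total": int(fields["total"])}

def reduce_log(lines):
    """Pair each epoch-timestamp line with the following 'some' line,
    yielding (seconds, avg10, total) tuples."""
    out = []
    ts = None
    for line in lines:
        if re.match(r"^\d+\.\d+$", line):
            ts = float(line)
        elif line.startswith("some") and ts is not None:
            d = parse_psi_some(line)
            out.append((ts, d["avg10"], d["total"]))
    return out
```

From two consecutive samples, the quantity bitbake compares against BB_PRESSURE_MAX_* is then (total2 - total1) / (ts2 - ts1), i.e. stall microseconds accumulated per second of wall clock.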
there is still something wrong with the latest pressure regulation: I see very high cpu pressure fluctuation and often only a few bitbake tasks running when the load isn't so bad.

NOTE: Pressure status changed to CPU: False, IO: None, Mem: None (CPU: 0.0/1000.0, IO: 0.0/None, Mem: 0.0/None)
NOTE: Running task 207 of 1315 (/OE/build/oe-core/openembedded-core/meta/recipes-devtools/gcc/gcc-source_13.2.bb: do_unpack)
NOTE: Pressure status changed to CPU: True, IO: None, Mem: None (CPU: 152474.1/1000.0, IO: 0.0/None, Mem: 0.0/None)

this is with Load average: 7.40 8.32 11.41 on a 32c 64t CPU and a small bitbake patch I've just sent to show the pressure numbers in the existing NOTE, so the pressure numbers are easy to see on AB or jenkins builds as well.

Attached log is from building rust with just oe-core and:
BB_PRESSURE_MAX_CPU = "1000"
BB_NUMBER_THREADS = "8"
BB_NUMBER_PARSE_THREADS = "64"
PARALLEL_MAKE = "-j 70 -l 140"

What surprises me a bit is that it was running just 1315 tasks and the pressure status changed due to CPU pressure 244 times, with the highest pressure almost 200K, even with the Load average staying relatively reasonable. Memory pressure stayed 0 the whole time, and IO pressure was over 1000 (IO not being used for pressure regulation) in 75 cases, with a highest value of 330K.

I'll rerun some of my test-oe-build-time benchmarks to see if it generates significantly different pressure now compared to:
https://github.com/shr-project/test-oe-build-time/commit/a388df10f8eaae4437cf9931f7ccc6f86f1acaea

maybe something else triggers these spikes, as it's on my regular desktop with chrome and other stuff running in the background. It will be interesting to see the pressure fluctuations from our jenkins or from AB builds; it's much more convenient than pressure graphs from buildstats as it shows the exact values bitbake was using.

Is anyone else seeing numbers like this?
Regards, On Mon, Aug 14, 2023 at 4:35 AM Randy MacLeod via lists.openembedded.org <randy.macleod=windriver.com@lists.openembedded.org> wrote: > On 2023-08-13 04:36, Richard Purdie wrote: > > Hi Qi, > > On Tue, 2023-08-08 at 05:55 +0000, Chen Qi via lists.openembedded.org > wrote: > > I implemented the logic you suggested below, but I found the 0 result issue remains. > I also checked my solution suggested in the previous email, but the 0 result issue also remains. > So I added more logs to check the PSI data behavior. (I really should have done this before!) > My setting is: BB_PRESSURE_MAX_IO = "1" > > The result is a little surprising to me. > See https://github.com/ChenQi1989/poky/blob/io-pressure-logs/console-latest.log > and https://github.com/ChenQi1989/poky/blob/io-pressure-logs/io_pressre_grep.log which is derived from 'grep io_pressure console-latest.log'. > > There are lines like: > WARNING: io_pressure is 50448.00396988443; tdiff = 1.0781397819519043 > WARNING: io_pressure is 0.0; tdiff = 1.015068531036377 > ... > WARNING: io_pressure is 50448.00396988443; tdiff = 1.0781397819519043 > WARNING: io_pressure is 0.0; tdiff = 1.015068531036377 > ... > WARNING: io_pressure is 124236.79015441194; tdiff = 0.9999372959136963 > WARNING: io_pressure is 186346.268013355; tdiff = 1.0000200271606445 > WARNING: io_pressure is 0.0; tdiff = 1.0000898838043213 > WARNING: io_pressure is 0.0; tdiff = 0.07874345779418945 > WARNING: io_pressure is 0.0; tdiff = 0.03340554237365723 > WARNING: io_pressure is 0.0; tdiff = 0.04926180839538574 > WARNING: io_pressure is 0.0; tdiff = 0.012816905975341797 > WARNING: io_pressure is 54966.74998052512; tdiff = 0.11936306953430176 > > From these logs, I'm starting to wonder if not using the average10/60 PSI data is a right choice in the first place. 
> Certainly 60 seconds is too long of an averaging time and
> we concluded that 10 seconds was too long last year as well but
> if we can come up with a test that shows that it's a good compromise,
> I don't object to that idea completely.
>
> [...]
>
> --
> # Randy MacLeod
> # Wind River Linux
>
> View/Reply Online (#14940):
> https://lists.openembedded.org/g/bitbake-devel/message/14940
On 8/15/23 19:13, Martin Jansa wrote:
> there is still something wrong with latest pressure regulation, I see
> very high cpu pressure fluctuation and often only a few bitbake tasks
> running when the load isn't so bad
> [...]
> [...]
>
> Is someone seeing numbers like this?

Hi Martin,

Definitely yes.
https://github.com/ChenQi1989/poky/blob/io-pressure-logs/io_pressre_grep.log

Please also see my previous reply:
https://lists.openembedded.org/g/bitbake-devel/message/14924

Looking at https://docs.kernel.org/accounting/psi.html again, I found it says:

"The kernel accepts window sizes ranging from 500ms to 10s, therefore min monitoring update interval is 50ms and max is 1s. Min limit is set to prevent overly frequent polling. Max limit is chosen as a high enough number after which monitors are most likely not needed and psi averages can be used instead."

I'm wondering if we should switch to using the monitoring capability as stated in the doc.

Besides, I think we should start to set values much larger than 1000, because 1000/1,000,000 = 0.1%, which does not seem to be a reasonable warning level. And the average load '7.40 8.32 11.41' means 7.4 percent is stalled in the last 10s, which obviously is much larger than 0.1%. We should set values such as 50,000 (5%) or 100,000 (10%).
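If I understand the units right (the PSI "total" counter is stall time in microseconds, and bitbake compares its growth per second of wall clock against BB_PRESSURE_MAX_*), the percentages above follow from a trivial conversion. A sketch, with a hypothetical helper name of my own:

```python
def threshold_to_percent(bb_pressure_max):
    """BB_PRESSURE_MAX_* is stall time in microseconds accumulated per
    second of wall clock, so 1,000,000 would mean fully stalled (100%)."""
    return bb_pressure_max / 1_000_000 * 100
```

So a threshold of 1000 is roughly 0.1%, 50,000 is 5%, and Martin's observed spike of 152,474 corresponds to about 15% of the interval stalled.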
Best Regards,

Qi

> [...]
On 2023-08-15 23:29, ChenQi wrote:
> On 8/15/23 19:13, Martin Jansa wrote:
>> there is still something wrong with latest pressure regulation, I see
>> very high cpu pressure fluctuation and often only a few bitbake tasks
>> running when the load isn't so bad
>> [...]
>> [...]
>
> I'm wondering if we should switch to using the monitoring capability
> as stated in the doc.

It would be nice, but the notification API is root only, iirc. I agree that we should poll /proc/pressure less frequently, likely by tracking when we were last called rather than having a dedicated thread.

> Besides, I think we should start to set values much larger than 1000,
> because 1000/1,000,000 = 0.1%, which does not seem to be a reasonable
> warning level.

Remember that it's an indication of being overloaded.
See slide 9, Max Total pressure build times, from:
https://summit.yoctoproject.org/yocto-project-summit-2022-11/talk/7F8R9C/

> And the average load '7.40 8.32 11.41' means 7.4 percent is stalled
> in the last 10s, which obviously is much larger than 0.1%.

I don't think so. This is load average, so it's the average number of jobs that were ready to run over the last minute, not a %.

> We should start to set values such as 50,000 (5%) or 100,000 (10%).

The ideal would be 0% CPU/IO contention but 100% cpu usage. Practical limits can be a bit higher.

I'll have more time to work on this feature now and over the coming weeks and months. I'll start by writing a test case, but until we get a per-bitbake-server job server in place, I'm not going to try to optimize things at the bitbake level. It's better than it was and probably okay for 4.3. The job server change will smooth things out substantially.

../Randy "rushing one more email out before calling it a day" MacLeod

> Best Regards,
> Qi
>
> [...]
diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
index 48788f4aa6..7c482a55f3 100644
--- a/bitbake/lib/bb/runqueue.py
+++ b/bitbake/lib/bb/runqueue.py
@@ -179,6 +179,7 @@ class RunQueueScheduler(object):
                 self.prev_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
                 self.prev_pressure_time = time.time()
                 self.check_pressure = True
+                self.psi_exceeded = False
             except:
                 bb.note("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
                 self.check_pressure = False
@@ -191,6 +192,14 @@ class RunQueueScheduler(object):
         BB_PRESSURE_MAX_{CPU|IO|MEMORY} are set, return True if above threshold.
         """
         if self.check_pressure:
+            now = time.time()
+            tdiff = now - self.prev_pressure_time
+            # If task spawning has been blocked, block it for at least 1s;
+            # otherwise, check the pressure ASAP. This is because if tdiff is very small,
+            # the result of (curr_xxx_pressure - prev_xxx_pressure) might be 0, resulting
+            # in any small value not having effect.
+            if self.psi_exceeded and tdiff < 1.0:
+                return self.psi_exceeded
             with open("/proc/pressure/cpu") as cpu_pressure_fds, \
                 open("/proc/pressure/io") as io_pressure_fds, \
                 open("/proc/pressure/memory") as memory_pressure_fds:
@@ -198,25 +207,19 @@ class RunQueueScheduler(object):
                 curr_cpu_pressure = cpu_pressure_fds.readline().split()[4].split("=")[1]
                 curr_io_pressure = io_pressure_fds.readline().split()[4].split("=")[1]
                 curr_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
-                now = time.time()
-                tdiff = now - self.prev_pressure_time
-                if tdiff > 1.0:
-                    exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure
-                    exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure
-                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure
-                    self.prev_cpu_pressure = curr_cpu_pressure
-                    self.prev_io_pressure = curr_io_pressure
-                    self.prev_memory_pressure = curr_memory_pressure
-                    self.prev_pressure_time = now
-                else:
-                    exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure
-                    exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure
-                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure
+                exceeds_cpu_pressure = self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure
+                exceeds_io_pressure = self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure
+                exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure
+                self.prev_cpu_pressure = curr_cpu_pressure
+                self.prev_io_pressure = curr_io_pressure
+                self.prev_memory_pressure = curr_memory_pressure
+                self.prev_pressure_time = now
+                self.psi_exceeded = exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure
                 pressure_state = (exceeds_cpu_pressure, exceeds_io_pressure, exceeds_memory_pressure)
                 if hasattr(self, "pressure_state") and pressure_state != self.pressure_state:
                     bb.note("Pressure status changed to CPU: %s, IO: %s, Mem: %s" % pressure_state)
                 self.pressure_state = pressure_state
-            return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure)
+            return self.psi_exceeded
         return False

     def next_buildable_task(self):
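[Editor's note: the behaviour the patch introduces can be illustrated with a small, self-contained model. This is not the bitbake code: `PressureChecker` is a hypothetical class, it tracks only one pressure counter, and the cumulative `total=` value and timestamps are passed in directly instead of being read from /proc/pressure. It only shows the decision logic: rate-limit re-checks to 1 s while blocked, check on every call otherwise.]

```python
class PressureChecker:
    """Simplified model of the patched check_pressure logic."""
    def __init__(self, max_cpu_pressure):
        self.max_cpu_pressure = max_cpu_pressure  # threshold in us of stall per second
        self.prev_total = None
        self.prev_time = None
        self.exceeded = False

    def check(self, curr_total, now):
        # curr_total: cumulative PSI 'total=' stall time (us), monotonically increasing
        if self.prev_total is None:
            # First sample only primes the previous-reading state.
            self.prev_total, self.prev_time = curr_total, now
            return False
        tdiff = now - self.prev_time
        # While blocked, hold the decision for at least 1s so that a tiny
        # tdiff (where the counter delta may still be 0) cannot unblock
        # task spawning spuriously.
        if self.exceeded and tdiff < 1.0:
            return True
        rate = (curr_total - self.prev_total) / tdiff
        self.exceeded = rate > self.max_cpu_pressure
        self.prev_total, self.prev_time = curr_total, now
        return self.exceeded

checker = PressureChecker(max_cpu_pressure=100.0)  # hypothetical threshold
print(checker.check(0, 0.0))    # primes state -> False
print(checker.check(500, 1.0))  # 500 us/s > 100 -> True (blocked)
print(checker.check(500, 1.5))  # within 1s of blocking -> still True
print(checker.check(510, 2.5))  # 10/1.5 us/s < 100 -> False
```

Note that this model also shows the "peak" issue described in the cover letter: the instant `check` returns False, every queued task becomes eligible at once, which is what the proposed "ask again in a while" scheduler feedback aims to smooth out.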