Message ID | 20220113010527.181-1-jatedev@gmail.com |
---|---|
State | New |
Series | [1.46] hashserv: specify loop for asyncio in python < 3.7 |
Hi Jate

On 1/13/22 2:05 AM, Jate Sujjavanich wrote:
> Detect python version 3.6 and below restoring loop argument where
> it is still required. In 3.7 auto loop detection is available.
>
> Bitbake 1.46 is used in dunfell which requires a minimum python version
> of 3.5. Omitting this argument leads to a regression and hang during
> "Initialising tasks" at 44%.

Thanks for the patch!
Weirdly, though, my host (Ubuntu 20.04) has Python 3.8 and still regularly has the "Initializing task" step stuck at 44% for several minutes.

Here's what strace says while this happens:

futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1642092729, tv_nsec=110486000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1642092729, tv_nsec=360665000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1642092729, tv_nsec=610862000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1642092729, tv_nsec=861086000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1642092730, tv_nsec=111295000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
...

Any clue why this can happen?

Thanks
Michael.
Hello Michael,

I tested on an Ubuntu 20.04 docker container, and the code worked in that environment. It has kernel 5.10.25-linuxkit which is different from the default kernel in Ubuntu 20.04 which is 5.4.

I chose to keep the loop argument for versions 3.6 and below because python first deprecated the loop argument in 3.7. Perhaps it's safer to keep specifying loop for 3.9 and below. There is one other report of the bug but it seems to be the result of the fix for 3.10 in October. So it was probably working for python 3.9 and below.

Could you try the fix with the minor number changed to 10?

    if sys.version_info[0] == 3 and sys.version_info[1] < 10:

If that works, I will update the patch. I am not familiar with the asyncio library and its system calls, so I'm hoping this works.

Ashish: You reported behavior like this bug. Do you have more information on your Python and build environment?

Thanks,
- Jate S.
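The version check proposed above can also be written as a tuple comparison, which makes the cutoff easy to move between 3.7 and 3.10 while testing. This is only a minimal sketch: `needs_explicit_loop`, `loop_kwargs` and the `cutoff` parameter are hypothetical names for illustration and are not part of BitBake. Note that `asyncio.start_server()` no longer accepts `loop` from Python 3.10 onwards, so the keyword must be left out there.

```python
import sys


def needs_explicit_loop(version_info=sys.version_info, cutoff=(3, 10)):
    """Return True if the interpreter is older than `cutoff` (major, minor).

    `cutoff` is an illustrative parameter; (3, 10) mirrors the value
    suggested for testing in this thread, (3, 7) mirrors the patch.
    """
    return tuple(version_info[:2]) < tuple(cutoff)


def loop_kwargs(loop, version_info=sys.version_info, cutoff=(3, 10)):
    """Build the extra keyword arguments for asyncio.start_server()."""
    if needs_explicit_loop(version_info, cutoff):
        return {"loop": loop}
    return {}
```

With such a helper the call site would stay a single statement, for example `asyncio.start_server(self.handle_client, host, port, **loop_kwargs(self.loop))`.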
Hi Jate

Thanks for your advice!

On 1/13/22 7:04 PM, Jate Sujjavanich wrote:
> Hello Michael,
>
> I tested on an Ubuntu 20.04 docker container, and the code worked in
> that environment. It has kernel 5.10.25-linuxkit which is different
> from the default kernel in Ubuntu 20.04 which is 5.4.
>
> I chose to keep the loop argument for versions 3.6 and below because
> python first deprecated the loop argument in 3.7. Perhaps it's safer
> to keep specifying loop for 3.9 and below. There is one other report
> of the bug but it seems to be the result of the fix for 3.10 in
> October. So it was probably working for python 3.9 and below.
>
> Could you try the fix with the minor number changed to 10?
>
> if sys.version_info[0] == 3 and sys.version_info[1] < 10:
>
> If that works, I will update the patch. I am not familiar with the
> asyncio library and its system calls, so I'm hoping this works.

Actually, I don't have the issue on BitBake 1.46 (testing on Dunfell). Just tested.

I have it with the version in master Poky, and your patch doesn't apply.

Cheers,
Michael.
On 1/14/22 4:45 PM, Michael Opdenacker wrote:
>
> Actually, I don't have the issue on BitBake 1.46 (testing on Dunfell).
> Just tested.
>
> I have it with the version in master Poky, and your patch doesn't apply.

I filed a new bug about this:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14689

Cheers
Michael.
diff --git a/lib/hashserv/server.py b/lib/hashserv/server.py
index 56f354bd..04a48c39 100644
--- a/lib/hashserv/server.py
+++ b/lib/hashserv/server.py
@@ -12,6 +12,7 @@ import math
 import os
 import signal
 import socket
+import sys
 import time
 from . import chunkify, DEFAULT_MAX_CHUNK
 
@@ -419,9 +420,14 @@ class Server(object):
         self._cleanup_socket = None
 
     def start_tcp_server(self, host, port):
-        self.server = self.loop.run_until_complete(
-            asyncio.start_server(self.handle_client, host, port)
-        )
+        if sys.version_info[0] == 3 and sys.version_info[1] < 7:
+            self.server = self.loop.run_until_complete(
+                asyncio.start_server(self.handle_client, host, port, loop=self.loop)
+            )
+        else:
+            self.server = self.loop.run_until_complete(
+                asyncio.start_server(self.handle_client, host, port)
+            )
 
         for s in self.server.sockets:
             logger.info('Listening on %r' % (s.getsockname(),))
@@ -444,9 +450,14 @@ class Server(object):
         try:
             # Work around path length limits in AF_UNIX
             os.chdir(os.path.dirname(path))
-            self.server = self.loop.run_until_complete(
-                asyncio.start_unix_server(self.handle_client, os.path.basename(path))
-            )
+            if sys.version_info[0] == 3 and sys.version_info[1] < 7:
+                self.server = self.loop.run_until_complete(
+                    asyncio.start_unix_server(self.handle_client, os.path.basename(path), loop=self.loop)
+                )
+            else:
+                self.server = self.loop.run_until_complete(
+                    asyncio.start_unix_server(self.handle_client, os.path.basename(path))
+                )
         finally:
             os.chdir(cwd)
 
Detect python version 3.6 and below restoring loop argument where
it is still required. In 3.7 auto loop detection is available.

Bitbake 1.46 is used in dunfell which requires a minimum python version
of 3.5. Omitting this argument leads to a regression and hang during
"Initialising tasks" at 44%.

Signed-off-by: Jate Sujjavanich <jatedev@gmail.com>
---
 lib/hashserv/server.py | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)
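For readers who want to try the approach outside BitBake, the following standalone sketch starts a TCP server with the same version-dependent `loop` argument and shuts it down again. It is an illustration under assumptions, not the hashserv code: `handle_client` is a placeholder handler, `demo` is a hypothetical driver, and the address `127.0.0.1` with port `0` (let the OS pick a free port) is chosen only for the example.

```python
import asyncio
import sys


async def handle_client(reader, writer):
    # Placeholder handler (assumption): close the connection immediately.
    writer.close()


def start_tcp_server(loop, host, port):
    # Pass loop= only where the patch says it is still required (< 3.7).
    kwargs = {}
    if sys.version_info < (3, 7):
        kwargs["loop"] = loop
    return loop.run_until_complete(
        asyncio.start_server(handle_client, host, port, **kwargs)
    )


def demo():
    # Hypothetical driver: start the server, read back the bound port,
    # then close everything so the example leaves nothing running.
    loop = asyncio.new_event_loop()
    try:
        server = start_tcp_server(loop, "127.0.0.1", 0)
        port = server.sockets[0].getsockname()[1]
        server.close()
        loop.run_until_complete(server.wait_closed())
        return port
    finally:
        loop.close()
```

Building the keyword arguments once avoids duplicating the `run_until_complete()` call in both branches; the behaviour is intended to match the patch, with the cutoff left at 3.7.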