[1.46] hashserv: specify loop for asyncio in python < 3.7

Message ID 20220113010527.181-1-jatedev@gmail.com
State New

Commit Message

Jate Sujjavanich Jan. 13, 2022, 1:05 a.m. UTC
Detect python version 3.6 and below restoring loop argument where
it is still required. In 3.7 auto loop detection is available.

Bitbake 1.46 is used in dunfell which requires a minimum python version
of 3.5. Omitting this argument leads to a regression and hang during
"Initialising tasks" at 44%.

Signed-off-by: Jate Sujjavanich <jatedev@gmail.com>
---
 lib/hashserv/server.py | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

Comments

Michael Opdenacker Jan. 13, 2022, 4:53 p.m. UTC | #1
Hi Jate

On 1/13/22 2:05 AM, Jate Sujjavanich wrote:
> Detect python version 3.6 and below restoring loop argument where
> it is still required. In 3.7 auto loop detection is available.
>
> Bitbake 1.46 is used in dunfell which requires a minimum python version
> of 3.5. Omitting this argument leads to a regression and hang during
> "Initialising tasks" at 44%.


Thanks for the patch!
Weirdly, though, my host (Ubuntu 20.04) has Python 3.8 and still
regularly has the "Initialising tasks" step stuck at 44% for several
minutes.

Here's what strace says while this happens:
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
{tv_sec=1642092729, tv_nsec=110486000}, FUTEX_BITSET_MATCH_ANY) = -1
ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
{tv_sec=1642092729, tv_nsec=360665000}, FUTEX_BITSET_MATCH_ANY) = -1
ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
{tv_sec=1642092729, tv_nsec=610862000}, FUTEX_BITSET_MATCH_ANY) = -1
ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
{tv_sec=1642092729, tv_nsec=861086000}, FUTEX_BITSET_MATCH_ANY) = -1
ETIMEDOUT (Connection timed out)
futex(0x277c200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
{tv_sec=1642092730, tv_nsec=111295000}, FUTEX_BITSET_MATCH_ANY) = -1
ETIMEDOUT (Connection timed out)
...

Any clue why this can happen?

Thanks
Michael.
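
A note on reading the strace output above: the second futex argument group is an absolute deadline, so the spacing between consecutive deadlines gives the wait interval. The short sketch below only does that arithmetic on the five values quoted in the trace; the variable and function names are illustrative and not part of BitBake. A spacing of roughly 250 ms is consistent with a thread repeatedly waiting with a short timeout, but it does not by itself show what is being waited on.

```python
# Absolute futex deadlines (tv_sec, tv_nsec) copied from the strace output.
deadlines = [
    (1642092729, 110486000),
    (1642092729, 360665000),
    (1642092729, 610862000),
    (1642092729, 861086000),
    (1642092730, 111295000),
]

NS_PER_SEC = 10**9


def to_ns(sec, nsec):
    """Convert a (seconds, nanoseconds) pair to integer nanoseconds."""
    return sec * NS_PER_SEC + nsec


stamps = [to_ns(sec, nsec) for sec, nsec in deadlines]

# Gap between each deadline and the next, in milliseconds.
gaps_ms = [(later - earlier) / 1e6 for earlier, later in zip(stamps, stamps[1:])]

for gap in gaps_ms:
    print("%.3f ms" % gap)
```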
Jate Sujjavanich Jan. 13, 2022, 6:04 p.m. UTC | #2
Hello Michael,

I tested on an Ubuntu 20.04 docker container, and the code worked in that
environment. It has kernel 5.10.25-linuxkit which is different from the
default kernel in Ubuntu 20.04 which is 5.4.

I chose to keep the loop argument for versions 3.6 and below because python
first deprecated the loop argument in 3.7. Perhaps it's safer to keep
specifying loop for 3.9 and below. There is one other report of the bug but
it seems to be the result of the fix for 3.10 in October. So it was
probably working for python 3.9 and below.

Could you try the fix with the minor number changed to 10?

if sys.version_info[0] == 3 and sys.version_info[1] < 10:

If that works, I will update the patch. I am not familiar with the asyncio
library and its system calls, so I'm hoping this works.
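
For reference, the same check can be written as a tuple comparison, which keeps the cut-off version in one place. This is only a sketch: `old_style` and `tuple_style` are hypothetical names, and the limit of 10 is the value proposed above, not a tested threshold. As far as the asyncio documentation goes, the `loop` parameter was removed from `asyncio.start_server` in Python 3.10, so 10 would also be the highest usable limit.

```python
import sys


def old_style(version, minor_limit):
    """The check as written in the patch, with the minor limit as a parameter."""
    return version[0] == 3 and version[1] < minor_limit


def tuple_style(version, minor_limit):
    """Compare (major, minor) as a tuple; matches old_style for any 3.x version."""
    return tuple(version[:2]) < (3, minor_limit)


samples = [(3, 5), (3, 6), (3, 7), (3, 9), (3, 10)]

# Which of the sample versions would still pass loop= with a limit of 10.
results = {version: tuple_style(version, 10) for version in samples}

# The same question for the interpreter running this snippet.
use_loop_here = tuple_style(sys.version_info, 10)
```

The two forms differ only for Python 2 versions, where the tuple comparison is true and the original check is false; that case does not arise for BitBake 1.46.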

Ashish: You reported behavior like this bug. Do you have more information
on your Python and build environment?


Thanks,
- Jate S.

Michael Opdenacker Jan. 14, 2022, 3:45 p.m. UTC | #3
Hi Jate

Thanks for your advice!

On 1/13/22 7:04 PM, Jate Sujjavanich wrote:
> Hello Michael,
>
> I tested on an Ubuntu 20.04 docker container, and the code worked in
> that environment. It has kernel 5.10.25-linuxkit which is different
> from the default kernel in Ubuntu 20.04 which is 5.4.
>
> I chose to keep the loop argument for versions 3.6 and below because
> python first deprecated the loop argument in 3.7. Perhaps it's safer
> to keep specifying loop for 3.9 and below. There is one other report
> of the bug but it seems to be the result of the fix for 3.10 in
> October. So it was probably working for python 3.9 and below.
>
> Could you try the fix with the minor number changed to 10?
>
> if sys.version_info[0] == 3 and sys.version_info[1] < 10:
>
> If that works, I will update the patch. I am not familiar with the
> asyncio library and its system calls, so I'm hoping this works.

Actually, I don't have the issue on BitBake 1.46 (testing on Dunfell).
Just tested.

I have it with the version in master Poky, and your patch doesn't apply.

Cheers,
Michael.
Michael Opdenacker Jan. 14, 2022, 4 p.m. UTC | #4
On 1/14/22 4:45 PM, Michael Opdenacker wrote:
>
> Actually, I don't have the issue on BitBake 1.46 (testing on Dunfell).
> Just tested.
>
> I have it with the version in master Poky, and your patch doesn't apply.


I filed a new bug about this:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14689
Cheers
Michael.

Patch

diff --git a/lib/hashserv/server.py b/lib/hashserv/server.py
index 56f354bd..04a48c39 100644
--- a/lib/hashserv/server.py
+++ b/lib/hashserv/server.py
@@ -12,6 +12,7 @@  import math
 import os
 import signal
 import socket
+import sys
 import time
 from . import chunkify, DEFAULT_MAX_CHUNK
 
@@ -419,9 +420,14 @@  class Server(object):
         self._cleanup_socket = None
 
     def start_tcp_server(self, host, port):
-        self.server = self.loop.run_until_complete(
-            asyncio.start_server(self.handle_client, host, port)
-        )
+        if sys.version_info[0] == 3 and sys.version_info[1] < 7:
+            self.server = self.loop.run_until_complete(
+                asyncio.start_server(self.handle_client, host, port, loop=self.loop)
+            )
+        else:
+            self.server = self.loop.run_until_complete(
+                asyncio.start_server(self.handle_client, host, port)
+            )
 
         for s in self.server.sockets:
             logger.info('Listening on %r' % (s.getsockname(),))
@@ -444,9 +450,14 @@  class Server(object):
         try:
             # Work around path length limits in AF_UNIX
             os.chdir(os.path.dirname(path))
-            self.server = self.loop.run_until_complete(
-                asyncio.start_unix_server(self.handle_client, os.path.basename(path))
-            )
+            if sys.version_info[0] == 3 and sys.version_info[1] < 7:
+                self.server = self.loop.run_until_complete(
+                    asyncio.start_unix_server(self.handle_client, os.path.basename(path), loop=self.loop)
+                )
+            else:
+                self.server = self.loop.run_until_complete(
+                    asyncio.start_unix_server(self.handle_client, os.path.basename(path))
+                )
         finally:
             os.chdir(cwd)
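
The two hunks above repeat the same version branch. One possible way to avoid the duplication, shown here only as a sketch and not as what the patch does, is to compute the optional keyword arguments once and splat them into each call. `loop_kwargs`, the standalone `start_tcp_server` function, and the stub `handle_client` are hypothetical stand-ins for the `Server` methods in `lib/hashserv/server.py`.

```python
import asyncio
import sys


def loop_kwargs(loop, version_info=sys.version_info):
    """Hypothetical helper: return {'loop': loop} only on Python older than 3.7."""
    if tuple(version_info[:2]) < (3, 7):
        return {"loop": loop}
    return {}


async def handle_client(reader, writer):
    # Stub standing in for Server.handle_client; it just closes the connection.
    writer.close()


def start_tcp_server(loop, host, port):
    """Start a TCP server on `loop`, passing loop= only where it is needed."""
    return loop.run_until_complete(
        asyncio.start_server(handle_client, host, port, **loop_kwargs(loop))
    )
```

The same `**loop_kwargs(loop)` could be passed to `asyncio.start_unix_server`, so both call sites would stay a single expression.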