diff mbox series

bitbake-worker: Fix silent hang issue caused by unexpected stdout content

Message ID 20240202093602.25618-1-yang.xu@mediatek.com
State Accepted, archived
Commit 08f3e677d6af27a41a918aaa9da9c1c9b20a0b95
Headers show
Series bitbake-worker: Fix silent hang issue caused by unexpected stdout content | expand

Commit Message

Yang Xu Feb. 2, 2024, 9:36 a.m. UTC
From: Yang Xu <yang.xu@mediatek.com>

This patch addresses an issue in bitbake-worker where stdout,
reserved for status reporting, is improperly accessed by child processes.

The problem occurs during the execution of parseRecipe,
which calls anonymous functions. If these functions use print-like operations,
they can inadvertently output data to stdout. This unexpected data can cause
the runqueue to hang silently, if the stdout buffer is flushed
before exec_task is executed.

To prevent this, the patch redirects stdout to /dev/null and ensures it is
flushed prior to the execution of exec_task.

Signed-off-by: Yang Xu <yang.xu@mediatek.com>
---
 bin/bitbake-worker | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Richard Purdie Feb. 4, 2024, 11:01 p.m. UTC | #1
On Fri, 2024-02-02 at 09:36 +0000, Yang Xu via lists.openembedded.org
wrote:
> From: Yang Xu <yang.xu@mediatek.com>
> 
> This patch addresses an issue in bitbake-worker where stdout,
> reserved for status reporting, is improperly accessed by child processes.
> 
> The problem occurs during the execution of parseRecipe,
> which calls anonymous functions. If these functions use print-like operations,
> they can inadvertently output data to stdout. This unexpected data can cause
> the runqueue to hang silently, if the stdout buffer is flushed
> before exec_task is executed.
> 
> To prevent this, the patch redirects stdout to /dev/null and ensures it is
> flushed prior to the execution of exec_task.
> 
> Signed-off-by: Yang Xu <yang.xu@mediatek.com>
> ---
>  bin/bitbake-worker | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/bin/bitbake-worker b/bin/bitbake-worker
> index eba9c562..0ba18572 100755
> --- a/bin/bitbake-worker
> +++ b/bin/bitbake-worker
> @@ -237,9 +237,11 @@ def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):
>              # Let SIGHUP exit as SIGTERM
>              signal.signal(signal.SIGHUP, sigterm_handler)
>  
> -            # No stdin
> -            newsi = os.open(os.devnull, os.O_RDWR)
> -            os.dup2(newsi, sys.stdin.fileno())
> +            # No stdin & stdout
> +            # stdout is used as a status report channel and must not be used by child processes.
> +            dumbio = os.open(os.devnull, os.O_RDWR)
> +            os.dup2(dumbio, sys.stdin.fileno())
> +            os.dup2(dumbio, sys.stdout.fileno())
>  
>              if umask:
>                  os.umask(umask)
> @@ -305,6 +307,10 @@ def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):
>                  if not quieterrors:
>                      logger.critical(traceback.format_exc())
>                  os._exit(1)
> +
> +            sys.stdout.flush()
> +            sys.stderr.flush()
> +
>              try:
>                  if dry_run:
>                      return 0

This looks like a good catch. 

I'm wondering if we should change the code to use a different file
descriptor for the communication and leave stdout/stderr alone
instead...

Cheers,

Richard
Yang Xu Feb. 22, 2024, 6:08 a.m. UTC | #2
Sorry, I noticed this patch has not been merged. Is there any further work that I need to complete?
Or is bitbake planning to swtich to the solution that does not use stdin/stdout directly.

Thank you

On Sun, 2024-02-04 at 23:01 +0000, Richard Purdie wrote:

> This patch addresses an issue in bitbake-worker where stdout,

> reserved for status reporting, is improperly accessed by child processes.

>

> The problem occurs during the execution of parseRecipe,

> which calls anonymous functions. If these functions use print-like operations,

> they can inadvertently output data to stdout. This unexpected data can cause

> the runqueue to hang silently, if the stdout buffer is flushed

> before exec_task is executed.

>

> To prevent this, the patch redirects stdout to /dev/null and ensures it is

> flushed prior to the execution of exec_task.

>

> Signed-off-by: Yang Xu <yang.xu@mediatek.com>

> ---

>  bin/bitbake-worker | 12 +++++++++---

>  1 file changed, 9 insertions(+), 3 deletions(-)

>

> diff --git a/bin/bitbake-worker b/bin/bitbake-worker

> index eba9c562..0ba18572 100755

> --- a/bin/bitbake-worker

> +++ b/bin/bitbake-worker

> @@ -237,9 +237,11 @@ def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):

>              # Let SIGHUP exit as SIGTERM

>              signal.signal(signal.SIGHUP, sigterm_handler)

>

> -            # No stdin

> -            newsi = os.open(os.devnull, os.O_RDWR)

> -            os.dup2(newsi, sys.stdin.fileno())

> +            # No stdin & stdout

> +            # stdout is used as a status report channel and must not be used by child processes.

> +            dumbio = os.open(os.devnull, os.O_RDWR)

> +            os.dup2(dumbio, sys.stdin.fileno())

> +            os.dup2(dumbio, sys.stdout.fileno())

>

>              if umask:

>                  os.umask(umask)

> @@ -305,6 +307,10 @@ def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):

>                  if not quieterrors:

>                      logger.critical(traceback.format_exc())

>                  os._exit(1)

> +

> +            sys.stdout.flush()

> +            sys.stderr.flush()

> +

>              try:

>                  if dry_run:

>                      return 0


This looks like a good catch.


I'm wondering if we should change the code to use a different file

descriptor for the communication and leave stdout/stderr alone

instead...


Cheers,


Richard
diff mbox series

Patch

diff --git a/bin/bitbake-worker b/bin/bitbake-worker
index eba9c562..0ba18572 100755
--- a/bin/bitbake-worker
+++ b/bin/bitbake-worker
@@ -237,9 +237,11 @@  def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):
             # Let SIGHUP exit as SIGTERM
             signal.signal(signal.SIGHUP, sigterm_handler)
 
-            # No stdin
-            newsi = os.open(os.devnull, os.O_RDWR)
-            os.dup2(newsi, sys.stdin.fileno())
+            # No stdin & stdout
+            # stdout is used as a status report channel and must not be used by child processes.
+            dumbio = os.open(os.devnull, os.O_RDWR)
+            os.dup2(dumbio, sys.stdin.fileno())
+            os.dup2(dumbio, sys.stdout.fileno())
 
             if umask:
                 os.umask(umask)
@@ -305,6 +307,10 @@  def fork_off_task(cfg, data, databuilder, workerdata, extraconfigdata, runtask):
                 if not quieterrors:
                     logger.critical(traceback.format_exc())
                 os._exit(1)
+
+            sys.stdout.flush()
+            sys.stderr.flush()
+
             try:
                 if dry_run:
                     return 0