About the sstate cache directory hierarchy

Message ID fd53e270-fe25-c1ad-522a-8444ca8ba642@windriver.com
State New

Commit Message

ChenQi Jan. 28, 2022, 7:06 a.m. UTC
Hi All,

I'm sending out this email because I'm wondering if we can change the 
sstate cache directory to use ${PN} and taskname as subdirectories. Hope to 
hear your opinions.

Below is the long story.

Recently I noticed that running `bitbake xxx -c cleansstate` usually 
takes more than 10 minutes.

After some investigation, I can see that most of the time is spent on 
file searching. This is because we have:

SSTATE_PATHSPEC   = 
"${SSTATE_DIR}/${SSTATE_EXTRAPATHWILDCARD}*/*/${SSTATE_PKGSPEC}*_${SSTATE_PATH_CURRTASK}.tar.zst*"

And our sstate cache directory's hierarchy uses hash[:2]/hash[2:4]/ as 
sub-directories.
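
To make the cost concrete, here is a minimal Python sketch of the current layout. It follows the return statement of generate_sstatefn shown in the attached patch, but the helper names hashed_relpath and hashed_clean_pattern are made up for illustration, and the directory count assumes the hash is a hexadecimal string.

```python
import posixpath

# Hypothetical helpers for illustration; they are not part of sstate.bbclass.


def hashed_relpath(spec, hash_, taskname, extension=".tar.zst"):
    """Relative path of an sstate object under the current layout."""
    fn = spec + hash_ + "_" + taskname + extension
    return posixpath.join(hash_[:2], hash_[2:4], fn)


def hashed_clean_pattern(sstate_dir, spec, taskname):
    """Glob used at clean time: the hash is unknown, so both levels are wildcards."""
    return posixpath.join(sstate_dir, "*", "*", spec + "*_" + taskname + ".tar.zst*")


# Assumption: hex hashes, so two levels of two hex digits give up to 16**4 directories.
MAX_LEAF_DIRS = 16 ** 4

if __name__ == "__main__":
    spec = "sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:"
    print(hashed_relpath(spec, "0123abcd", "package"))
    print(hashed_clean_pattern("/path/to/sstate-cache", spec, "package"))
    print(MAX_LEAF_DIRS)
```

Storing an object needs only its own hash, but cleaning has no hash to start from, which is why the pattern has to list every leaf directory.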

This essentially means that all sub-directories are searched, which 
takes a long time, especially when run for the first time. I made some 
changes to output the timing, and the logs are below.

$ bitbake glibc -c cleansstate
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_deploy_source_date_epoch.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 611.8865714073181 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_package.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.3219327926635742 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_package_qa.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.470815658569336 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_package_write_rpm.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.251939058303833 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_packagedata.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.2369801998138428 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc::2.34:r0::7:*_populate_lic.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.1668426990509033 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_populate_sysroot.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.385568380355835 seconds
WARNING: glibc-2.34-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/*/*/sstate:glibc:core2-64-poky-linux:2.34:r0:core2-64:7:*_stash_locale.tar.zst*
WARNING: glibc-2.34-r0 do_cleansstate: Took 1.4884181022644043 seconds

I figured that, unlike git, we do know what our sstate objects are, so 
it does not seem necessary to use the hash value for the sub-directories. 
So I changed the sstate directory hierarchy to use ${PN}/taskname/ as 
sub-directories, and here's the result.

$ bitbake libgcc -c cleansstate

WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/deploy_source_date_epoch/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_deploy_source_date_epoch.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.020630598068237305 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/package/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_package.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.0011608600616455078 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/package_qa/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_package_qa.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.0007557868957519531 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/package_write_rpm/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_package_write_rpm.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.0013995170593261719 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/packagedata/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_packagedata.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.0007488727569580078 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/populate_lic/sstate:libgcc::11.2.0:r0::7:*_populate_lic.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.0005896091461181641 seconds
WARNING: libgcc-11.2.0-r0 do_cleansstate: Removing 
/ala-lpggp72/qichen/LAT/builds/share/sstate-cache/libgcc/populate_sysroot/sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:*_populate_sysroot.tar.zst*
WARNING: libgcc-11.2.0-r0 do_cleansstate: Took 0.00080108642578125 seconds

It's much faster.
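
The difference can be reproduced on a toy cache with a short Python sketch. This is only an illustration under stated assumptions: the helpers touch, find_hashed, find_named and demo are hypothetical, the object names are shortened, and it uses the standard glob module directly rather than oe.path.remove. Both lookups return the same object; the difference is how many directories have to be listed to find it.

```python
import glob
import os
import tempfile

# Hypothetical helpers for illustration; they are not part of sstate.bbclass.


def touch(root, relpath):
    """Create an empty file at root/relpath, including parent directories."""
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w"):
        pass
    return path


def find_hashed(root, spec, task):
    """Current layout: every hash[:2]/hash[2:4]/ directory has to be listed."""
    pattern = os.path.join(glob.escape(root), "*", "*", spec + "*_" + task + ".tar.zst*")
    return sorted(glob.glob(pattern))


def find_named(root, pn, spec, task):
    """Proposed layout: only the <pn>/<task>/ directory has to be listed."""
    pattern = os.path.join(glob.escape(root), pn, task, spec + "*_" + task + ".tar.zst*")
    return sorted(glob.glob(pattern))


def demo():
    spec, pn, task, sig = "sstate:libgcc:", "libgcc", "package", "0123abcd"
    fn = spec + sig + "_" + task + ".tar.zst"
    other = "sstate:zlib:ffee0000_package.tar.zst"  # unrelated object
    with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
        touch(old, os.path.join(sig[:2], sig[2:4], fn))
        touch(old, os.path.join("ff", "ee", other))
        touch(new, os.path.join(pn, task, fn))
        touch(new, os.path.join("zlib", task, other))
        hashed = [os.path.basename(p) for p in find_hashed(old, spec, task)]
        named = [os.path.basename(p) for p in find_named(new, pn, spec, task)]
    return hashed, named


if __name__ == "__main__":
    print(demo())
```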

In addition, the sub-directory names now carry more information, which 
should make the sstate cache easier to manage.

Attached is the patch to quickly try things out. Hope to hear your opinions.

Best Regards,

Chen Qi

Comments

Alexander Kanavin Jan. 28, 2022, 10:29 a.m. UTC | #1
I do like the idea; not everyone has a PCIe Gen4/5 SSD for builds, or
rigorously trims sstate on a schedule. But there may be consequences or
regressions, maybe RP will immediately shoot it down :)

I would however still place a single level of hash[:2] *under* the
pn/task/, to avoid too many files piling up in a single directory.
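
A minimal Python sketch of that variant, assuming the same file name construction as generate_sstatefn in the patch. The helper names named_relpath and named_clean_pattern are hypothetical, and this layout is not what the posted patch implements.

```python
import posixpath

# Hypothetical helpers sketching the suggestion above; not taken from the patch.


def named_relpath(spec, hash_, taskname, extension=".tar.zst"):
    """pn/task/hash[:2]/filename, with PN taken from the second spec field."""
    pn = spec.split(":")[1]
    fn = spec + hash_ + "_" + taskname + extension
    return posixpath.join(pn, taskname, hash_[:2], fn)


def named_clean_pattern(sstate_dir, pn, spec, taskname):
    """Clean-time glob: one wildcard level, confined to a single pn/task directory."""
    return posixpath.join(
        sstate_dir, pn, taskname, "*", spec + "*_" + taskname + ".tar.zst*"
    )


if __name__ == "__main__":
    spec = "sstate:libgcc:core2-64-poky-linux:11.2.0:r0:core2-64:7:"
    print(named_relpath(spec, "0123abcd", "package"))
    print(named_clean_pattern("/path/to/sstate-cache", "libgcc", spec, "package"))
```

With hex hashes, the wildcard level then covers at most 256 directories per recipe and task, rather than the whole cache.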

Can you send this as an RFC patch?

Alex

ChenQi Feb. 7, 2022, 7:37 a.m. UTC | #2
On 1/28/22 6:29 PM, Alexander Kanavin wrote:
> I do like the idea; not everyone has a pcie gen4/5 ssd for builds, or 
> rigorously trims sstate on a schedule. But there may be consequences 
> or regressions, maybe RP will immediately shoot it down :)
>
> I would however still place a single level of hash[:2] *under* the 
> pn/task/, to avoid too many files piling up in a single directory.
>
> Can you send this as an RFC patch?
>
> Alex

Yes. I will make sure that all oe selftests pass before sending out the patch.

Regards,

Qi


Jacob Kroon May 16, 2022, 5:30 a.m. UTC | #3
I thought this sounded interesting. Has anything come out of this that I 
missed?

Jacob
ChenQi May 16, 2022, 5:35 a.m. UTC | #4
Hi Jacob,

No. I haven't done anything since then, because I've been involved in something else.

Regards,
Qi


Patch

From 6f148d399c2ac575a9079eae682a56802dfac3ca Mon Sep 17 00:00:00 2001
From: Chen Qi <Qi.Chen@windriver.com>
Date: Thu, 27 Jan 2022 22:57:57 -0800
Subject: [PATCH] sstate.bbclass: use ${PN}/taskname as sub dirs

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
---
 meta/classes/sstate.bbclass | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index 17dcf4cc17..c657d40151 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -17,6 +17,7 @@  def generate_sstatefn(spec, hash, taskname, siginfo, d):
     if not hash:
         hash = "INVALID"
     fn = spec + hash + "_" + taskname + extension
+    pn = spec.split(':')[1]
     # If the filename is too long, attempt to reduce it
     if len(fn) > limit:
         components = spec.split(":")
@@ -30,7 +31,7 @@  def generate_sstatefn(spec, hash, taskname, siginfo, d):
         fn = spec + hash + "_" + taskname + extension
         if len(fn) > limit:
             bb.fatal("Unable to reduce sstate name to less than 255 chararacters")
-    return hash[:2] + "/" + hash[2:4] + "/" + fn
+    return pn + "/" + taskname + "/" + fn
 
 SSTATE_PKGARCH    = "${PACKAGE_ARCH}"
 SSTATE_PKGSPEC    = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
@@ -39,7 +40,7 @@  SSTATE_PKGNAME    = "${SSTATE_EXTRAPATH}${@generate_sstatefn(d.getVar('SSTATE_PK
 SSTATE_PKG        = "${SSTATE_DIR}/${SSTATE_PKGNAME}"
 SSTATE_EXTRAPATH   = ""
 SSTATE_EXTRAPATHWILDCARD = ""
-SSTATE_PATHSPEC   = "${SSTATE_DIR}/${SSTATE_EXTRAPATHWILDCARD}*/*/${SSTATE_PKGSPEC}*_${SSTATE_PATH_CURRTASK}.tar.zst*"
+SSTATE_PATHSPEC   = "${SSTATE_DIR}/${SSTATE_EXTRAPATHWILDCARD}${PN}/${SSTATE_PATH_CURRTASK}/${SSTATE_PKGSPEC}*_${SSTATE_PATH_CURRTASK}.tar.zst*"
 
 # explicitly make PV to depend on evaluated value of PV variable
 PV[vardepvalue] = "${PV}"
@@ -482,12 +483,16 @@  python sstate_hardcode_path_unpack () {
 
 def sstate_clean_cachefile(ss, d):
     import oe.path
+    import time
 
     if d.getVarFlag('do_%s' % ss['task'], 'task'):
         d.setVar("SSTATE_PATH_CURRTASK", ss['task'])
         sstatepkgfile = d.getVar('SSTATE_PATHSPEC')
-        bb.note("Removing %s" % sstatepkgfile)
+        bb.warn("Removing %s" % sstatepkgfile)
+        start = time.time()
         oe.path.remove(sstatepkgfile)
+        end = time.time()
+        bb.warn("Took %s seconds" % (end-start))
 
 def sstate_clean_cachefiles(d):
     for task in (d.getVar('SSTATETASKS') or "").split():
-- 
2.33.0