Message ID | 20250506131751.93175-1-sider123456789101112131415@gmail.com |
---|---|
State | New |
Series | Speeding up Checking sstate mirror object availability : Add async filtering SSTATE_MIRRORS |
Hello dear colleagues!

Following the patch letter, I am attaching information about the problem we found, how we investigated it, how we fixed it, and the results of the fix.

As the number of addresses specified in the SSTATE_MIRRORS variable increases, the duration of the "Checking sstate mirror object availability" process also increases. This is logical, as the number of addresses to probe grows. However, it was observed that the "Checking sstate mirror object availability" process time increases even when SSTATE_MIRRORS contains one valid address and multiple unreachable addresses.

To validate this observation, a testing system was developed. Two computers were connected in a local network: one performs the build, while the second shares the sstate-cache. On the second computer, the sstate-cache is hosted on a single port. Iterative builds were performed on Computer 1, with SSTATE_MIRRORS configured to include one valid port and a varying number (from 1 to 48) of "non-working" servers, that is, addresses pointing to Computer 2 but with closed ports. Below is an example configuration:

SSTATE_MIRRORS ?= " \
file://.* http://10.138.70.7:8150/sstate-cache/PATH;downloadfilename=PATH \
file://.* http://10.138.70.7:8151/sstate-cache/PATH;downloadfilename=PATH \
file://.* http://10.138.70.7:8152/sstate-cache/PATH;downloadfilename=PATH \
file://.* http://10.138.70.7:9999/sstate-cache/PATH;downloadfilename=PATH"

Here, http://10.138.70.7:9999 hosts a valid sstate-cache server, while http://10.138.70.7:8150, http://10.138.70.7:8151, and http://10.138.70.7:8152 are closed ports.
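For anyone who wants to reproduce a setup of this shape, the value could be generated with a small helper along the following lines. This is only a sketch: the function name `build_test_mirrors` and its parameters are hypothetical and are not part of the patch or of BitBake; the host and ports are the ones from the example configuration.

```python
def build_test_mirrors(host, valid_port, closed_ports, path="sstate-cache"):
    """Return the body of an SSTATE_MIRRORS value.

    Hypothetical helper: closed ports are listed first and the single
    valid port last, as in the example configuration.
    """
    entries = [
        f"file://.* http://{host}:{port}/{path}/PATH;downloadfilename=PATH"
        for port in [*closed_ports, valid_port]
    ]
    # A trailing backslash lets the value span several lines in local.conf.
    return " \\\n".join(entries)


if __name__ == "__main__":
    # Three closed ports plus the one valid port used in the experiment.
    value = build_test_mirrors("10.138.70.7", 9999, range(8150, 8153))
    print(f'SSTATE_MIRRORS ?= " \\\n{value}"')
```

Varying the length of `closed_ports` from 1 to 48 would give the series of configurations used in the measurements.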
The time from the start of the build to the completion of the "Checking sstate mirror object availability" process was measured, and the results are visualized in the figure:
https://drive.google.com/file/d/1uFwzDSnD0I_WGAeUyvNXTRb9qIxQBXqN/view?usp=sharing

Thus, it can be concluded that increasing the number of addresses in SSTATE_MIRRORS leads to longer "Checking sstate mirror object availability" process times, even when those addresses are unreachable. It is also worth noting that servers can be unavailable for various reasons, ranging from internal errors to network problems or DDoS attacks. Regardless of the reason, having unavailable servers specified in local.conf leads to longer build times.

A patch has been implemented that, during the parsing of the local.conf configuration file, immediately after reading the SSTATE_MIRRORS variable, attempts to asynchronously establish TCP connections for each address listed in SSTATE_MIRRORS. If a connection is successfully established, the address remains in SSTATE_MIRRORS; if the connection attempt fails, the address is removed from the variable. This ensures that any addresses unreachable at the TCP level are excluded from subsequent processing. These modifications are activated only if the FILTER_SSTATE_MIRRORS variable is set to 1 in local.conf.

The experiment described above was conducted after applying the patch. The results were recorded and visualized in:
https://drive.google.com/file/d/1vlkjbK20iCVLpYTWhZb08gNbDpXm2_ur/view?usp=sharing

As shown in the graph, the address filtering works successfully, and increasing the number of non-working servers does not lead to longer "Checking sstate mirror object availability" process times.

For clarity, the data before and after applying the patch are also visualized on a logarithmic scale:
https://drive.google.com/file/d/1BKgWtMAlI-39VnTImI8NjNOpE6rIhG-7/view?usp=sharing

Best regards,
Konstantin E. Kondratenko

Tue, 6 May 2025
at 16:17, KonstantinKondratenko <sider123456789101112131415@gmail.com>:

> As the number of addresses specified in the SSTATE_MIRRORS variable
> increases, the duration of the "Checking sstate mirror object availability"
> process also increases.
> This is logical, as the number of addresses to probe grows. However, it
> was observed that the "Checking sstate mirror object availability" process
> time increases even when SSTATE_MIRRORS contains one valid address and
> multiple unreachable addresses.
>
> A patch has been implemented that, during the parsing of the local.conf
> configuration file, immediately after reading the SSTATE_MIRRORS variable,
> attempts to asynchronously establish TCP connections for each address
> listed in SSTATE_MIRRORS. If a connection is successfully established, the
> address remains in SSTATE_MIRRORS; if the connection attempt fails, the
> address is removed from the variable. This ensures that any addresses
> unreachable at the TCP level are excluded from subsequent processing.
> These modifications are activated only if the FILTER_SSTATE_MIRRORS
> variable is set to 1 in local.conf.
>
> Signed-off-by: KonstantinKondratenko <sider123456789101112131415@gmail.com>
> ---
>  lib/bb/cookerdata.py | 67 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 67 insertions(+)
>
> diff --git a/lib/bb/cookerdata.py b/lib/bb/cookerdata.py
> index 1f447d30c2..8b642bdd90 100644
> --- a/lib/bb/cookerdata.py
> +++ b/lib/bb/cookerdata.py
> @@ -305,6 +305,73 @@ class CookerDataBuilder(object):
>
>              bb.codeparser.update_module_dependencies(self.data)
>
> +            if self.data.getVar('FILTER_SSTATE_MIRRORS') == '1':
> +                try:
> +                    import asyncio
> +                    from urllib.parse import urlparse
> +
> +                    mirrors = self.data.getVar('SSTATE_MIRRORS') or None
> +                    if mirrors:
> +                        parts = mirrors.split()
> +                        mirrors_list = [' '.join(parts[i:i+2]) for i in range(0, len(parts), 2)]
> +
> +                    def get_default_port(scheme):
> +                        if scheme == 'http':
> +                            return 80
> +                        elif scheme == 'https':
> +                            return 443
> +                        elif scheme == 'ftp':
> +                            return 21
> +                        elif scheme == 'ftps':
> +                            return 990
> +                        else:
> +                            return 8888 # Default value for unknown schemes
> +
> +                    async def is_port_open(address, port, timeout=5):
> +                        try:
> +                            reader, writer = await asyncio.wait_for(asyncio.open_connection(address, port), timeout)
> +                            writer.close()
> +                            await writer.wait_closed()
> +                            return True
> +                        except (asyncio.TimeoutError, OSError):
> +                            return False
> +
> +                    def extract_address_and_port(url):
> +                        parsed_url = urlparse(url)
> +                        address = parsed_url.hostname
> +                        port = parsed_url.port if parsed_url.port else get_default_port(parsed_url.scheme)
> +                        return address, port
> +
> +
> +                    async def async_mirrors_filter(mirrors_list, data):
> +                        tasks = []
> +
> +                        for item in mirrors_list:
> +                            url = item.split()[1]
> +                            address, port = extract_address_and_port(url)
> +                            if address and port:
> +                                tasks.append(is_port_open(address, port))
> +
> +                        results = await asyncio.gather(*tasks)
> +
> +                        if False in results:
> +                            output_string = " ".join(
> +                                mirror
> +                                for mirror, result in zip(mirrors_list, results)
> +                                if result
> +                            )
> +                            if not output_string:
> +                                logger.warning("All SSTATE_MIRRORS are not available")
> +                            else:
> +                                logger.warning(f'Several SSTATE_MIRRORS are not available, using: {str(output_string)}!')
> +                            data.setVar('SSTATE_MIRRORS', output_string)
> +
> +                    if mirrors:
> +                        asyncio.run(async_mirrors_filter(mirrors_list, self.data))
> +
> +                except Exception as e:
> +                    logger.error(f"Error checking SSTATE_MIRRORS availability: {e}")
> +
>              # Handle obsolete variable names
>              d = self.data
>              renamedvars = d.getVarFlags('BB_RENAMED_VARIABLES') or {}
>
diff --git a/lib/bb/cookerdata.py b/lib/bb/cookerdata.py
index 1f447d30c2..8b642bdd90 100644
--- a/lib/bb/cookerdata.py
+++ b/lib/bb/cookerdata.py
@@ -305,6 +305,73 @@ class CookerDataBuilder(object):
 
             bb.codeparser.update_module_dependencies(self.data)
 
+            if self.data.getVar('FILTER_SSTATE_MIRRORS') == '1':
+                try:
+                    import asyncio
+                    from urllib.parse import urlparse
+
+                    mirrors = self.data.getVar('SSTATE_MIRRORS') or None
+                    if mirrors:
+                        parts = mirrors.split()
+                        mirrors_list = [' '.join(parts[i:i+2]) for i in range(0, len(parts), 2)]
+
+                    def get_default_port(scheme):
+                        if scheme == 'http':
+                            return 80
+                        elif scheme == 'https':
+                            return 443
+                        elif scheme == 'ftp':
+                            return 21
+                        elif scheme == 'ftps':
+                            return 990
+                        else:
+                            return 8888 # Default value for unknown schemes
+
+                    async def is_port_open(address, port, timeout=5):
+                        try:
+                            reader, writer = await asyncio.wait_for(asyncio.open_connection(address, port), timeout)
+                            writer.close()
+                            await writer.wait_closed()
+                            return True
+                        except (asyncio.TimeoutError, OSError):
+                            return False
+
+                    def extract_address_and_port(url):
+                        parsed_url = urlparse(url)
+                        address = parsed_url.hostname
+                        port = parsed_url.port if parsed_url.port else get_default_port(parsed_url.scheme)
+                        return address, port
+
+
+                    async def async_mirrors_filter(mirrors_list, data):
+                        tasks = []
+
+                        for item in mirrors_list:
+                            url = item.split()[1]
+                            address, port = extract_address_and_port(url)
+                            if address and port:
+                                tasks.append(is_port_open(address, port))
+
+                        results = await asyncio.gather(*tasks)
+
+                        if False in results:
+                            output_string = " ".join(
+                                mirror
+                                for mirror, result in zip(mirrors_list, results)
+                                if result
+                            )
+                            if not output_string:
+                                logger.warning("All SSTATE_MIRRORS are not available")
+                            else:
+                                logger.warning(f'Several SSTATE_MIRRORS are not available, using: {str(output_string)}!')
+                            data.setVar('SSTATE_MIRRORS', output_string)
+
+                    if mirrors:
+                        asyncio.run(async_mirrors_filter(mirrors_list, self.data))
+
+                except Exception as e:
+                    logger.error(f"Error checking SSTATE_MIRRORS availability: {e}")
+
             # Handle obsolete variable names
             d = self.data
             renamedvars = d.getVarFlags('BB_RENAMED_VARIABLES') or {}
As the number of addresses specified in the SSTATE_MIRRORS variable increases, the duration of the "Checking sstate mirror object availability" process also increases.
This is logical, as the number of addresses to probe grows. However, it was observed that the "Checking sstate mirror object availability" process time increases even when SSTATE_MIRRORS contains one valid address and multiple unreachable addresses.

A patch has been implemented that, during the parsing of the local.conf configuration file, immediately after reading the SSTATE_MIRRORS variable, attempts to asynchronously establish TCP connections for each address listed in SSTATE_MIRRORS. If a connection is successfully established, the address remains in SSTATE_MIRRORS; if the connection attempt fails, the address is removed from the variable. This ensures that any addresses unreachable at the TCP level are excluded from subsequent processing.
These modifications are activated only if the FILTER_SSTATE_MIRRORS variable is set to 1 in local.conf.

Signed-off-by: KonstantinKondratenko <sider123456789101112131415@gmail.com>
---
 lib/bb/cookerdata.py | 67 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)
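The filtering idea described in the commit message can also be tried outside BitBake. The following is a minimal standalone sketch, not the code from the patch: the names `split_mirror_pairs`, `probe` and `filter_mirrors` are hypothetical, and it takes the SSTATE_MIRRORS value as a plain string instead of using the datastore. Two choices here are assumptions of the sketch rather than behaviour of the patch: every entry yields exactly one probe result, so results stay aligned with entries, and entries that cannot be probed (no host or no known default port) are kept.

```python
import asyncio
from urllib.parse import urlparse

# Default ports for the schemes the patch handles explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "ftps": 990}


def split_mirror_pairs(mirrors):
    """Split a mirrors string into (pattern, url) pairs; a dangling token is dropped."""
    parts = mirrors.split()
    return [(parts[i], parts[i + 1]) for i in range(0, len(parts) - 1, 2)]


async def is_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (asyncio.TimeoutError, OSError):
        return False
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass  # the peer may reset the connection while it is closing
    return True


async def probe(url, timeout):
    """Probe one mirror URL; entries that cannot be probed are kept (assumption)."""
    parsed = urlparse(url)
    try:
        port = parsed.port or DEFAULT_PORTS.get(parsed.scheme)
    except ValueError:  # malformed port in the URL
        return True
    if not parsed.hostname or port is None:
        return True
    return await is_port_open(parsed.hostname, port, timeout)


async def filter_mirrors(mirrors, timeout=5.0):
    """Return the mirrors string with TCP-unreachable entries removed."""
    pairs = split_mirror_pairs(mirrors)
    # All probes run concurrently, one result per (pattern, url) pair.
    results = await asyncio.gather(*(probe(url, timeout) for _, url in pairs))
    return " ".join(f"{pat} {url}" for (pat, url), ok in zip(pairs, results) if ok)


if __name__ == "__main__":
    example = (
        "file://.* http://10.138.70.7:8150/sstate-cache/PATH;downloadfilename=PATH "
        "file://.* http://10.138.70.7:9999/sstate-cache/PATH;downloadfilename=PATH"
    )
    print(asyncio.run(filter_mirrors(example, timeout=2.0)))
```

Because the probes are gathered concurrently, the total wait is bounded by roughly one timeout rather than by the number of entries, which is the property the patch relies on.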