How to optimize application startup time

This page lists approaches for optimizing application startup time. Feel free to discuss the page and to share your knowledge by improving it.

The approaches are grouped into three areas:

  • Toolchain level – this includes various optimizations in the compiler, the linker and the dynamic linker.
  • Platform level – this includes various approaches that platforms offer.
  • Application level – this includes everything that the app itself can do to start up faster.

Toolchain level

Linker

  • Link Time Optimization (LTO) or Whole Program Optimization (WPO) can be used to improve startup times.
    • gcc documentation [gcc.gnu.org]
    • MSVC documentation [msdn.microsoft.com]
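    • Background: LTO lets the toolchain optimize across translation units. When it is combined with profile-guided optimization, which collects usage statistics of your application, the toolchain can also re-arrange the object code to improve loading time.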
  • (GNU ld) Use -Bsymbolic-functions for your shared libraries. This tells the linker to bind calls to functions defined within your library directly to those definitions instead of resolving them through the usual dynamic symbol lookup. Every such call is faster the first time it is made, since no lookup is required, which leads to faster load times.
    • Note: A side effect is that LD_PRELOAD can no longer override such a symbol for calls made from within the library itself.
    • There is a qmake variable, QMAKE_LFLAGS_BSYMBOLIC_FUNC, that expands to the corresponding linker flag if the linker supports symbolic functions.
    • Note: You can create a whitelist of symbols that are exempt from the -Bsymbolic-functions switch by using the --dynamic-list parameter. See for example QtCore.dynlist in the QtCore source tree.
  • (GNU ld) Make sure to use GNU style hashes for symbol lookup (--hash-style=gnu). This is the default on Linux; however, some toolchains might still default to the old sysv hash style, which has slower symbol lookup when using shared libraries. The GNU hash style improves startup time by reducing the time needed to resolve symbols.
  • Profiling startup time optimizations
    • valgrind (http://valgrind.org/) provides callgrind and cachegrind, which can measure the cost incurred before your main function runs. This includes symbol resolution and the initialization code of dependent libraries.
    • (glibc ld.so) Set the LD_DEBUG environment variable (for example LD_DEBUG=statistics) to make the dynamic linker print statistics about its work.

Platform level

  • (MeeGo) MeeGo supports “boosted” applications. See Harmattan_Booster_for_Qt_Quick_Applications for how to enable boosting for your Qt Quick applications, and the MeeGo launcher documentation [apidocs.meego.com] for how to boost Qt and generic apps.
    • Note: Although the boosters are part of MeeGo, the core parts are written in plain C++ and can be re-used on other platforms as well.
    • Note: You have to rewrite a small portion of your app, and you need to compile your app as a position independent executable.
    • Background: MeeGo pre-launches a few processes in the background that wait for the actual app to launch. Since all initialization is already done (e.g. the QApplication constructor has already run), the app startup is perceived as considerably faster.
  • (KDE) kdeinit is used to start applications
    • Note: This approach can be adapted to other platforms as well to improve startup times of multiple apps
    • Background: kdeinit is a pre-started process that links to various core libraries, so symbol resolving and library mapping into memory is already partially done. When the actual application is started, kdeinit forks and executes it. The time required to resolve symbols goes down, and thus startup time goes down as well.
  • Cache things
    • Example: MeeGo shader cache [qt.gitorious.org] compiles OpenGL shaders into a binary representation and puts them in a shared memory area for other apps to use. Only the first application startup will be slow, since that has to populate the cache. All further apps start faster.
    • Example (KDE): KDE uses an icon cache to avoid loading and processing every icon over and over again.
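
The pre-launch idea behind the MeeGo boosters and kdeinit can be illustrated with a minimal POSIX sketch. It is not the code of either project: expensiveInit(), appMain() and launchFromWarmProcess() are hypothetical names, and the "initialization" is only a flag standing in for library loading and toolkit setup. The launcher pays the initialization cost once, and every forked child inherits the initialized state.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-ins for expensive one-time setup
// (library mapping, symbol resolution, toolkit construction).
static int g_initCount = 0;
static bool g_toolkitReady = false;

void expensiveInit()
{
    ++g_initCount;
    g_toolkitReady = true;
}

// Hypothetical application entry point. It returns 0 if it found the
// process already initialized and 1 otherwise.
int appMain()
{
    return g_toolkitReady ? 0 : 1;
}

// Forks the (already initialized) launcher process and runs `entry`
// in the child. Returns the child's exit code, or -1 on error.
int launchFromWarmProcess(int (*entry)())
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(entry());  // child: inherits the parent's initialized state

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A launcher would call expensiveInit() once at its own startup and then call launchFromWarmProcess(appMain) for each application start. A real implementation would typically load the application's code in the child rather than call a function pointer, which this sketch leaves out.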

Application level

  • QML apps: See Performance_tip_Use_Loaders
  • Lazy initialization
    • Load things only when you need them, not on application startup
    • Don’t use static global objects. The code that initializes such an object runs before the main() function, so startup time goes up. Instead, use the Singleton pattern to create the object the first time it is used.
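
The following is a minimal sketch of the create-on-first-use approach, using a function-local static in standard C++. IconTheme and iconTheme() are hypothetical names chosen for illustration, and the constructions counter exists only to make the construction time observable.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical object that is expensive to construct.
class IconTheme
{
public:
    static int constructions;  // counts constructor runs, for illustration only

    IconTheme() : names{"open", "save", "quit"} { ++constructions; }
    std::size_t size() const { return names.size(); }

private:
    std::vector<std::string> names;
};

int IconTheme::constructions = 0;

// Avoid a namespace-scope `static IconTheme g_theme;` here:
// its constructor would run before main(), on every startup.

// Accessor that constructs the object on the first call only.
// Initialization of a function-local static is thread-safe since C++11.
IconTheme &iconTheme()
{
    static IconTheme instance;
    return instance;
}
```

Callers use iconTheme() wherever they would have used the global object. An application that never needs the object never pays for its construction. In Qt code, the Q_GLOBAL_STATIC macro offers a comparable create-on-first-use global.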
