OutSystems BPT — Timeouts and exceptions in ‘Light Processes’ — Part I
Hello low coders!
If you are using ‘Light Business Processes’ in your applications, I have some tips for you. I have been using light processes in a project, and I needed to make them resilient to failure. For that, I had to do some research on what happens behind the scenes.
Light Processes timeout
As you may already know, one of the conditions for light process execution is that the process flow must consist of a single ‘Automatic Activity’.
It is also important to keep in mind that in ‘Light Processes’ the ‘Automatic Activities’ time out after 3 minutes (instead of 5 minutes in traditional processes).
Another difference is that ‘Light Processes’ have 20 threads per front-end by default, while traditional BPT processes have 10 threads available per front-end.
✋ Let's test this!
- In my ‘Automatic Activity’, I add a 4-minute sleep. After 3 minutes an exception should be thrown, preventing the sleep action from finishing.
- I launch a process instance and while I wait, I open Service Center.
- There it is! An error appears in the logs: “Scheduler Service: Error executing light event…Request duration = 180 secs.”
💡 Size your logic to run under 3 minutes to avoid problems.
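One way to stay inside the limit is to give the work a time budget and stop before the timeout hits, leaving the rest for a later run. The sketch below is plain Python, not OutSystems logic: the function name process_within_budget, the 30-second safety margin and the injectable clock are all assumptions made for illustration. Only the 180-second figure comes from the log above.

```python
import time

# 180 s is the request duration reported in the log above.
TIMEOUT_SECONDS = 180
# Assumption: an arbitrary safety margin, chosen only for this example.
SAFETY_MARGIN_SECONDS = 30


def process_within_budget(items, handle,
                          budget_seconds=TIMEOUT_SECONDS - SAFETY_MARGIN_SECONDS,
                          clock=time.monotonic):
    """Handle items until the time budget is spent; return the unprocessed rest."""
    deadline = clock() + budget_seconds
    pending = list(items)
    # Check the clock before each item so we never start work past the deadline.
    while pending and clock() < deadline:
        handle(pending.pop(0))
    return pending
```

Whatever the function returns still needs processing, for example by launching another process instance for the remainder. Note that the check happens between items, so a single item that takes longer than the margin can still run into the timeout.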
Error handling
In regular BPT processes, an error raised in the scope of an ‘Automatic Activity’ will cause its execution flow to stop. The platform will retry the flow after a while, until it is successfully executed or the underlying process is terminated.
✋ How long does it take for a ‘Light Process’ to be retried if something goes wrong? We shall investigate!
- In my example process, I force an exception to occur after a 1-minute sleep this time.
- I launch two instances and I monitor the ‘Light Processes’ events system table (ossys_BPM_Event). No errors so far, and both events are ready to be processed in a jiffy.
- After the first execution I see the Error_Count going from 0 to 1 (as expected) and the Next_Run is scheduled 1 minute later (start at 13:45:40 + 1 minute of sleep + error + retry 1 minute later = 13:47:41).
- I wait for the retries, and I monitor the Next_Run timestamps. After the second retry, I manually update the Next_Run fields so that I don’t have to wait too long.
- I am a bit surprised by the waiting intervals: 1 hour for the 2nd retry to happen; two and a half days for the 3rd retry; 5 months for the 4th; 25 years for the 5th one! 😱
- I stopped here because I was scared of what came after 25 years. 😁
+-------------+----------------+
| Error Count | Next Run delay |
+-------------+----------------+
| 0           | -              |
| 1           | 1 minute       |
| 2           | 1 hour         |
| 3           | 2.5 days       |
| 4           | 5 months       |
| 5           | 25 years       |
+-------------+----------------+
- This test was performed in my ‘Personal Environment’ running Version 11.11.0 (Build 26942) of the ‘OutSystems Platform’.
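The observed intervals are consistent with each delay being 60 times the previous one, that is, 60^(n-1) minutes for error count n. This is only an inference from the five values I measured, not documented platform behavior, and the function name retry_delay is hypothetical. A small sketch to check the arithmetic:

```python
from datetime import timedelta


def retry_delay(error_count: int) -> timedelta:
    """Delay before the next run, ASSUMING it grows 60-fold with each error."""
    if error_count < 1:
        return timedelta(0)
    return timedelta(minutes=60 ** (error_count - 1))


for n in range(1, 6):
    print(n, retry_delay(n))
# 1 0:01:00            -> 1 minute
# 2 1:00:00            -> 1 hour
# 3 2 days, 12:00:00   -> 2.5 days
# 4 150 days, 0:00:00  -> about 5 months
# 5 9000 days, 0:00:00 -> about 24.6 years

# First retry from the test above: start + 1 minute of sleep + 1 minute of delay.
start = timedelta(hours=13, minutes=45, seconds=40)
print(start + timedelta(minutes=1) + retry_delay(1))  # 13:47:40
```

The computed 13:47:40 is one second short of the 13:47:41 I saw in the table, which is presumably the time spent raising and logging the error.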
💡 If you are expecting your processes to recover quickly, maybe you are safe until the 2nd retry. After that, your business requirements might be impacted by the delay of 2.5 days.
⚠️ These values are merely indicative. Don’t take them for granted. Test the Platform to see how far you can push it. 🚀
I have brought this to the attention of OutSystems and it’s under investigation.
Part II
Stay tuned for part II where I will present more tips and recommendations to handle exceptions in ‘Light Processes’.
SQL snippets
If you want to build your own ‘Environment Health’, here’s some SQL to get you started. The SQL executed in Service Center is probably a bit more complex, but not by much.
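As a rough starting point, a query along these lines lists the events that have failed at least once, worst offenders first. It relies only on the Error_Count and Next_Run columns mentioned earlier. Everything else is an assumption for illustration: the Id column, the placeholder rows, and the in-memory SQLite table standing in for the real ossys_BPM_Event system table, whose full schema is not shown here.

```python
import sqlite3

# Stand-in for the real system table: only Error_Count and Next_Run come from
# the article; the Id column and the rows below are made-up placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ossys_BPM_Event ("
    "Id INTEGER PRIMARY KEY, Error_Count INTEGER, Next_Run TEXT)"
)
conn.executemany(
    "INSERT INTO ossys_BPM_Event (Error_Count, Next_Run) VALUES (?, ?)",
    [
        (0, "2000-01-01 00:00:00"),
        (1, "2000-01-01 00:01:00"),
        (3, "2000-01-03 12:00:00"),
    ],
)

# Events that have failed at least once, highest error count first.
FAILING_EVENTS_SQL = """
SELECT Id, Error_Count, Next_Run
FROM ossys_BPM_Event
WHERE Error_Count > 0
ORDER BY Error_Count DESC, Next_Run
"""

rows = conn.execute(FAILING_EVENTS_SQL).fetchall()
for event_id, error_count, next_run in rows:
    print(event_id, error_count, next_run)
```

Against a real environment, only the SELECT statement is of interest, and it would need adapting to the actual columns of the table.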
I used carbon.now.sh to make the code snippets and ozh.github.io/ascii-tables for the ASCII table.