1) BPM is certainly suitable for processes of this size. If your processes become more complex over time, you will be ready for that growth. If you write the state engine yourself, you are likely to run into issues when requirements change in the future.
2) With historyLevel set to full, everything is audited: every activity executed, every task completed, every process run and every variable set in them. This is all you need for reporting and auditing.
3) There are different ways of implementing retries in Activiti, e.g. for expected failures such as network issues. You can use the BPMNError approach, combined with error-catching events. This approach handles the error as part of the process, and you can model any number of retries. Another approach is to make the calls that should be retried on exception asynchronous. Activiti will then use the job executor to execute these parts of the process, as opposed to using the calling thread that does the API call. The advantage of the job executor is that, after a failure, it can retry any number of times (3 by default). If the job has still failed after those X attempts, it remains inactive. You can use our API to query for these failed jobs and inspect the exception stacktrace and message. If needed, you can ask Activiti to execute the job again any number of times. The process stays inactive until the job has executed successfully.
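
To make the job-executor behaviour in point 3 more concrete, here is a minimal plain-Java simulation of it. It is not Activiti code: RetrySketch, Job, execute, failedJobs, retry and flaky are all hypothetical names made up for this sketch. It only illustrates the idea of a retry budget, a job that stays parked once the budget is used up, and a manual re-execution afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Plain-Java simulation of the retry behaviour described above; not Activiti code. */
final class RetrySketch {

    /** A unit of asynchronous work with a retry budget (hypothetical type). */
    static final class Job {
        final Callable<String> work;
        int retriesLeft;
        boolean completed;
        String result;
        String lastExceptionMessage; // kept so a failed job can be inspected later

        Job(Callable<String> work, int retries) {
            this.work = work;
            this.retriesLeft = retries;
        }
    }

    /** Runs the job until it succeeds or its retry budget is used up. */
    static void execute(Job job) {
        while (!job.completed && job.retriesLeft > 0) {
            try {
                job.result = job.work.call();
                job.completed = true;
                job.lastExceptionMessage = null;
            } catch (Exception e) {
                job.retriesLeft--;                       // one attempt consumed
                job.lastExceptionMessage = e.getMessage();
            }
        }
    }

    /** Returns the jobs that failed and have no retries left. */
    static List<Job> failedJobs(List<Job> jobs) {
        List<Job> failed = new ArrayList<>();
        for (Job job : jobs) {
            if (!job.completed && job.retriesLeft == 0) {
                failed.add(job);
            }
        }
        return failed;
    }

    /** Gives a parked job a fresh retry budget and runs it again. */
    static void retry(Job job, int retries) {
        job.retriesLeft = retries;
        execute(job);
    }

    /** Hypothetical flaky call: throws on the first `failures` invocations, then succeeds. */
    static Callable<String> flaky(int failures) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < failures) {
                throw new IllegalStateException("network down (attempt " + calls[0] + ")");
            }
            return "ok";
        };
    }

    public static void main(String[] args) {
        Job job = new Job(flaky(4), 3);   // fails 4 times, budget of 3 attempts
        execute(job);
        System.out.println("completed=" + job.completed + ", last error=" + job.lastExceptionMessage);

        retry(job, 3);                    // manual re-execution of the parked job
        System.out.println("completed=" + job.completed + ", result=" + job.result);
    }
}
```

In Activiti itself the equivalent operations are exposed through the ManagementService (job queries, reading the exception stacktrace, resetting retries, executing a job). The exact method names depend on the version you run, so check the Javadoc for your release instead of relying on this sketch.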