Monday, June 8, 2015

poolboy pitfall 2

The poolboy issue described in this article has been fixed, so to avoid it just make sure that you use at least version 1.5.1 of the library. Unfortunately, another problem popped up.

Retries with poolboy

For some resources it is important to retry for a given time before reporting a failure to the caller. At first sight the poolboy library provides all the necessary features.

  1. Internally it uses a supervisor to restart terminated processes (for retries);
  2. The client can specify a timeout to wait for worker checkout.
So in the worker's code one just needs to terminate the process if the resource is not available, and retries will be organised by poolboy.

handle_call(die, _From, State) ->
    {stop, {error, died}, dead, State}.

The issue

In fact the situation is a bit more complicated. Whenever a gen_server callback returns a stop tuple, it just instructs the underlying OTP code to terminate the process. The documentation does not state whether the caller gets the response first and then the process terminates, or vice versa. In addition, poolboy's supervisor is notified about the worker's termination, but the caller is not, unless it sets up an additional link or monitor.
So the worker's termination and the checkout from the pool are not synchronised. As a result, poolboy:checkout might return the Pid of a worker that has already terminated. Further use of that Pid leads to exception exit: {noproc, ...}. Obviously the same can happen with poolboy:transaction.
The client can handle this error by reporting a failure, but that does not fulfil the requirement to retry for a given period of time.
The issue is reported, in the best traditions of TDD, as a pull request with failing tests.

Workarounds

Since it was not fixed quickly, the issue seems to be quite fundamental. To continue using poolboy, some workaround is required.
The issue popped up when I tried to organise retry logic by means of poolboy. An obvious workaround would be to move this logic to either the worker or the client, but my perfectionism did not allow me to expose such complexity at that level.
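To give an idea of what the client-side variant would look like, here is a minimal sketch. The module retry_sketch and the function with_retries/2 are hypothetical names, not part of poolboy, and the 10 ms pause between attempts is an arbitrary assumption.

```erlang
-module(retry_sketch).
-export([with_retries/2]).

%% Hypothetical helper, not part of poolboy.
%% Runs Fun until it returns normally or until TimeoutMs milliseconds
%% have passed. Only exits of the form {noproc, _} trigger a retry.
with_retries(Fun, TimeoutMs) ->
    Deadline = erlang:monotonic_time(millisecond) + TimeoutMs,
    loop(Fun, Deadline).

loop(Fun, Deadline) ->
    try
        {ok, Fun()}
    catch
        exit:{noproc, _} ->
            case erlang:monotonic_time(millisecond) < Deadline of
                true ->
                    %% Short pause (assumed value) before the next attempt.
                    timer:sleep(10),
                    loop(Fun, Deadline);
                false ->
                    {error, timeout}
            end
    end.
```

A client could wrap its pool access in such a fun, for example one that calls poolboy:transaction, but then every client has to carry this retry logic, which is exactly the complexity I wanted to avoid.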
Fortunately, the committers of the project suggested a simple technique to overcome the problem. The worker should check itself back in to the pool in case of success, so that termination and check-in are synchronised.
handle_call(die, _From, State) ->
    {stop, {error, died}, dead, State};
handle_call(ok, _From, State) ->
    poolboy:checkin(pool_name, self()),
    {reply, ok, State}.
This trick implies that poolboy:transaction cannot be used anymore. It also breaks separation of concerns and abstraction, because the worker starts "knowing" about the pool. But overall I find it a "good deal" compared to the other workarounds because of its simplicity.

Friday, June 5, 2015

OOP in Erlang. Part 2. Polymorphism

Encapsulation was covered in the previous article of my OOP in Erlang series.

Polymorphism

Polymorphism is the ability to present the same interface for different instances (types).

Processes

First of all, since a process is a unit of encapsulation in Erlang, polymorphism can be implemented at that level.
A client can send the same message to different processes, which will handle the message differently.
For example, we can implement the gen_server behaviour in two modules.
serv1.erl
handle_call(message, _From, State) ->
  {reply, 1, State}.
serv2.erl
handle_call(message, _From, State) ->
  {reply, 2, State}.
Then the client chooses one of the gen_servers and starts it, saving the Pid of the process. After that gen_server:call(Pid, message) can be called, and the caller experiences different behaviour depending on the module chosen at the beginning.
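The same idea can be tried out in a single file. The sketch below uses plain processes instead of gen_server so that both "types" fit into one module; the module name poly_proc and the functions start/1 and call/2 are made up for this illustration.

```erlang
-module(poly_proc).
-export([start/1, call/2]).

%% Start a process of the chosen "type". Both types accept the same
%% message but answer it differently.
start(one) -> spawn(fun loop_one/0);
start(two) -> spawn(fun loop_two/0).

loop_one() ->
    receive
        {From, Ref, message} ->
            From ! {Ref, 1},
            loop_one()
    end.

loop_two() ->
    receive
        {From, Ref, message} ->
            From ! {Ref, 2},
            loop_two()
    end.

%% The caller only knows the Pid, not which loop is behind it.
call(Pid, Msg) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        exit(timeout)
    end.
```

Here poly_proc:call(Pid, message) plays the role of gen_server:call(Pid, message): the result depends only on which process was started.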

Dynamic types

Erlang is a dynamically typed language, so the module and even the function name can be taken from a variable in a function call. For example, the interfaces of dict and orddict from OTP are unified, and the following code shows a polymorphism implementation.
Module =
       case Ordering of
           unordered ->
               dict;
           ordered ->
               orddict
       end,
Container = Module:from_list(Values),
Module:find(Key, Container).
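Wrapped into a compilable module, the snippet could look like the sketch below. The module name poly_container and the function lookup/3 are hypothetical; dict and orddict are the standard OTP modules.

```erlang
-module(poly_container).
-export([lookup/3]).

%% Builds a container of the requested kind from a list of
%% {Key, Value} pairs and looks up Key in it.
lookup(Ordering, Values, Key) ->
    Module = container_module(Ordering),
    Container = Module:from_list(Values),
    Module:find(Key, Container).

%% Map the requested ordering to the module implementing it.
container_module(unordered) -> dict;
container_module(ordered) -> orddict.
```

Both dict:find/2 and orddict:find/2 return {ok, Value} or error, so the caller does not need to know which module was picked.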

Pattern matching

Pattern matching is a powerful feature of the language, which can be used to implement polymorphism. The idea is that a function changes its behaviour depending on the arguments it gets. The most obvious data type to match on is a record.
move(#duck{}) -> move_duck();
move(#fish{}) -> move_fish().
Extensive use of pattern matching for that purpose decouples an "object" from its behaviour (business logic); as a result, adding a new "type" requires changes in multiple modules. This is very similar to the Anemic domain model, which has some advantages though.
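A self-contained version of the record-based dispatch might look as follows. The record fields, the returned tuples and the names poly_match and demo/0 are assumptions made for this sketch.

```erlang
-module(poly_match).
-export([move/1, demo/0]).

%% Two "types" modelled as records; the name field is an assumption.
-record(duck, {name}).
-record(fish, {name}).

%% One function, with the clause selected by the record type.
move(#duck{name = Name}) -> {Name, walks};
move(#fish{name = Name}) -> {Name, swims}.

%% Calls move/1 on values of both types.
demo() ->
    [move(#duck{name = donald}), move(#fish{name = nemo})].
```

Adding a third record means adding a clause to move/1 and to every other function that dispatches on these records, which is the coupling cost mentioned above.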

Conclusion

Polymorphism in Erlang can be implemented in different ways. Even though processes provide the clearest interface for that, a developer should not create processes only for modelling business logic, because:
  1. Logic changes quite often, and changing all the boilerplate code of processes is an overhead.
  2. The system becomes more complicated with each new type of process spawned.