[Topaz-dev] Mulgara Performance Woes
Russell Uman
ruman at plos.org
Mon Mar 3 13:38:05 PST 2008
conflating two of ronald's posts in this response...
> Ooops, I just realized that that would lead to a deadlock:
> because axis is opening a new connection for every request
> (not a problem from an efficiency standpoint on a gig-e lan,
> hence why we never changed this), the "begin tx" would
> succeed but the next query/insert/commit could get stuck in
> the accept-queue.
>
> So, I take it back: maxThreads must indeed be large, and
> acceptCount probably 0.
in that case, perhaps the better route would be to reduce maxThreads on
the pub-app side.
that way, if mulgara is not in the middle of a long transaction,
there should always be threads available on the pub-app.
if mulgara is hung up, the pub-app will run out of threads sooner -
perhaps (hopefully?) returning 503 to client browsers - and mulgara will
have an easier time catching up with the queue when it comes back,
leading to fewer hangs?
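
to make the "small maxThreads, no accept queue" idea concrete, here is a minimal sketch in plain java. it is not tomcat configuration and not topaz code: the class and method names are made up for illustration, and a ThreadPoolExecutor with a SynchronousQueue is only an analogy for a connector that has a fixed number of worker threads and nowhere to park extra requests. the point it shows is that once every thread is busy, further requests are refused immediately instead of piling up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names are not from Tomcat, Axis, or Topaz.
class BoundedPoolSketch {

    // A pool with exactly maxThreads workers and no queue, loosely analogous
    // to a connector with maxThreads=N and an accept queue of zero.
    static ThreadPoolExecutor newPool(int maxThreads) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,
                0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),       // no queueing: hand off or fail
                new ThreadPoolExecutor.AbortPolicy());  // refuse work when all threads are busy
    }

    // Submits `requests` tasks that all block (simulating a stuck back end)
    // and returns how many of them were refused outright.
    static int countRejected(int maxThreads, int requests) {
        ThreadPoolExecutor pool = newPool(maxThreads);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();            // hold the thread until released
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                         // this is where a 503 would be sent
                }
            }
        } finally {
            release.countDown();                        // let the blocked tasks finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }
}
```

with 2 threads and 5 blocked requests, countRejected(2, 5) refuses the 3 requests that arrive after both threads are taken. whether the real connector turns that refusal into a 503 or a dropped connection is something we would have to check.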
i still think that we don't really get out of this problem until we find
a way to get better cooperation between pub-app and mulgara when mulgara
is stuck in a long transaction.
is there anything else in the timeout train that could be inducing
abandoned sessions on the mulgara side?
-connectionTimeout in tomcat (both HTTP and AJP connectors) is 20000ms,
but that doesn't seem to be an issue.
-mod_jk worker timeout is set to 600000ms
-apache timeout is set to 120s... is there any way the apache timeout
after 2 minutes is filtering down to the pub-app? usually when apache
times out we see the action continuing to go on the pub-app side...
> Well, you build in the knowledge in the app that mulgara only
> handles a single write-tx at a time and therefore
> lock/serialize all operations on the app side, but I don't
> think that's a good idea. I think the best approach is just
> to make sure the sessions get closed quickly - part of the
> problem currently is the use of http-sessions to manage state
> and what appears to be a bug in axis about losing cookies on
> timed out operations, something that should go away in the
> next release since we're switching to RMI.
i don't think we can wait until 0.9 to resolve this crisis. what can we
do in the short term to fix it? can we get axis to re-use the same thread?
can we get the pub-app to refrain from opening a new session after a
long wait?
i know that we can't squeeze any more juice out of mulgara until 0.9.
however, i do think we should be able to find a way to gracefully return
503 to the client when mulgara is slow, rather than piling up and
abandoning mulgara sessions so that the whole stack is hung.
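
as a rough sketch of what "gracefully return 503" could look like on the pub-app side: a gate in front of the store that waits a bounded time for a slot and gives up with 503 if none frees up, so the request never opens a session it will later abandon. everything here is an assumption for illustration (the class name, the permit count, the 50ms wait, and plain ints in place of real servlet response codes); it is not how the pub-app is currently written.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names are not from the pub-app or Mulgara.
class StoreGateSketch {
    static final int SC_OK = 200;
    static final int SC_SERVICE_UNAVAILABLE = 503;

    private final Semaphore permits;
    private final long maxWaitMillis;

    StoreGateSketch(int maxConcurrent, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrent, true);  // fair: first come, first served
        this.maxWaitMillis = maxWaitMillis;
    }

    // Runs the store call if a permit frees up in time; otherwise returns 503
    // without ever touching the store.
    int handle(Runnable storeCall) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            acquired = false;
        }
        if (!acquired) {
            return SC_SERVICE_UNAVAILABLE;
        }
        try {
            storeCall.run();
            return SC_OK;
        } finally {
            permits.release();
        }
    }

    // Usage sketch: one slow call holds the only permit while a second request arrives.
    static int[] demo() {
        StoreGateSketch gate = new StoreGateSketch(1, 50);
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread slow = new Thread(() -> gate.handle(() -> {
            entered.countDown();
            awaitQuietly(release);                 // simulate a long transaction
        }));
        slow.start();
        awaitQuietly(entered);                     // wait until the slow call holds the permit
        int whileBusy = gate.handle(() -> { });    // times out after 50ms
        release.countDown();
        try {
            slow.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int afterwards = gate.handle(() -> { });   // permit is free again
        return new int[] { whileBusy, afterwards };
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

in demo(), the request that arrives during the slow call gets 503 and the one that arrives afterwards gets 200. with a permit count of 1 this is also the "serialize on the app side" approach ronald argues against, so the count and the wait would need tuning rather than being taken from this sketch.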