Discussion:
[Haskell-beginners] how to read file with locking
Joey Hess
2010-10-09 22:49:12 UTC
The function below always returns "", rather than the file's contents.
_Real World Haskell_ touches on how laziness can make this problematic,
without really giving a solution, other than throwing in a putStr to
force evaluation, which can't be done here. How can I make hGetContents
strict, to ensure the file contents are really read before it gets closed?

readFile' file = do
    f <- openFile file ReadMode
    -- locking will go here
    s <- hGetContents f
    hClose f
    return s

Also, I noticed that opening a file and locking it involves a
very verbose seeming dance. (It's 2 lines of code in most other languages.)
Does this indicate that few people bother with file locking in Haskell
and so it still has these rough edges, or that there's a better way to do
it that I have not found yet?

openLocked file = do
    handle <- openFile file ReadMode
    lockfd <- handleToFd handle -- closes handle
    waitToSetLock lockfd (ReadLock, AbsoluteSeek, 0, 0)
    handle' <- fdToHandle lockfd
    return handle'
--
see shy jo
Jimmy Wylie
2010-10-10 01:40:11 UTC
Post by Joey Hess
The function below always returns "", rather than the file's contents.
_Real World Haskell_ touches on how laziness can make this problematic,
without really giving a solution, other than throwing in a putStr to
force evaluation, which can't be done here. How can I make hGetContents
strict, to ensure the file contents are really read before it gets closed?
readFile' file = do
    f <- openFile file ReadMode
    -- locking will go here
    s <- hGetContents f
    hClose f
    return s
Haskell won't actually read the file until you need its contents. To
ensure that the contents are read before the handle is closed, you need
to perform an operation that forces evaluation of the contents while the
handle is still open. Haskell will then read the file in order to perform
whatever operation you specify.
For example:
sendFile file = do
    f <- openFile file ReadMode
    s <- hGetContents f
    sendToNetwork s -- made up function
    hClose f

In your example, you're getting "" because the handle is closed, so
Haskell doesn't have anything to read.
You say you can't use putStr in this scenario, but if you're reading the
file you must be doing something with the contents. Wait to close the
handle until after that operation takes place.
Post by Joey Hess
Also, I noticed that opening a file and locking it involves a
very verbose seeming dance. (It's 2 lines of code in most other languages.)
Does this indicate that few people bother with file locking in Haskell
and so it still has these rough edges, or that there's a better way to do
it that I have not found yet?
openLocked file = do
    handle <- openFile file ReadMode
    lockfd <- handleToFd handle -- closes handle
    waitToSetLock lockfd (ReadLock, AbsoluteSeek, 0, 0)
    handle' <- fdToHandle lockfd
    return handle'
You should go to haskell.org/hoogle, and search openFile, or check out
the Haskell98 report, but I'm under the impression that ghc locks the
Isaac Dupree
2010-10-10 01:52:35 UTC
Post by Joey Hess
The function below always returns "", rather than the file's contents.
_Real World Haskell_ touches on how laziness can make this problematic,
without really giving a solution, other than throwing in a putStr to
force evaluation, which can't be done here. How can I make hGetContents
strict, to ensure the file contents are really read before it gets closed?
There are various hacks to do it (including ones with no external
side-effects that just use "case" / "seq" / etc. quite thoroughly)...
But realistically you should use a library that doesn't add laziness, if
laziness is not what you want. For example, "strict" on Hackage
http://hackage.haskell.org/package/strict
http://hackage.haskell.org/packages/archive/strict/0.3.2/doc/html/System-IO-Strict.html
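
A minimal sketch of the "seq"-style approach, without any extra package. It assumes that forcing the length of the String is acceptable (the whole file then sits in memory as a list of Char); the name readFileStrict is hypothetical and not part of any library.

```haskell
import Control.Exception (evaluate)
import System.IO

-- | Read a whole file as a String, forcing it before the handle is closed.
-- Hypothetical helper, shown for illustration only.
readFileStrict :: FilePath -> IO String
readFileStrict file =
  withFile file ReadMode $ \h -> do
    s <- hGetContents h
    -- length has to walk the entire list, so every character is read
    -- here, while the handle is still open.
    _ <- evaluate (length s)
    return s
```

Because the String is fully evaluated inside withFile, the caller can use it after the handle has been closed.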

-Isaac
Ertugrul Soeylemez
2010-10-10 04:39:11 UTC
Post by Joey Hess
The function below always returns "", rather than the file's contents.
_Real World Haskell_ touches on how laziness can make this
problematic, without really giving a solution, other than throwing in
a putStr to force evaluation, which can't be done here. How can I make
hGetContents strict, to ensure the file contents are really read
before it gets closed?
In general you don't want to read a file as a String non-lazily. That
uses a lot of memory, because String is just a type alias for [Char],
which means a linked list of Char values.

To read an entire file eagerly, the proper way in most cases is to use
Data.ByteString or Data.ByteString.Char8, which have a strict interface.
A ByteString from those modules is always either non-evaluated or
completely evaluated, and it is a real byte array instead of a linked
list. Both modules feature hGetContents:

import qualified Data.ByteString.Char8 as B
import System.IO (IOMode (ReadMode), withFile)

readFileLocked :: FilePath -> IO B.ByteString
readFileLocked fn =
    withFile fn ReadMode $ \h -> do
        lockFile h -- for a suitable function lockFile
        B.hGetContents h

It is, BTW, always preferable to use withFile over openFile, if you can.
This makes your code cleaner and also exception-safe.


Greets,
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/
Joey Hess
2010-10-11 04:29:35 UTC
Post by Ertugrul Soeylemez
readFileLocked :: FilePath -> IO B.ByteString
readFileLocked fn =
    withFile fn ReadMode $ \h -> do
        lockFile h -- for a suitable function lockFile
        B.hGetContents h
It is, BTW, always preferable to use withFile over openFile, if you can.
This makes your code cleaner and also exception-safe.
Unless there is a better locking primitive than waitToSetLock available,
I don't know how to build your lockFile function. It seems that it would
have a side effect of closing the handle. It could return a new handle
like this, but then withFile's automatic close of the file would be
defeated.

lockFile h = do
    lockfd <- handleToFd h -- closes h
    waitToSetLock lockfd (lockType mode, AbsoluteSeek, 0, 0)
    newh <- fdToHandle lockfd
    return newh

Here's what I'm using instead.

withFileLocked file mode action = do
    -- TODO: find a way to use bracket here
    handle <- openFile file mode
    lockfd <- handleToFd handle -- closes handle
    waitToSetLock lockfd (lockType mode, AbsoluteSeek, 0, 0)
    handle' <- fdToHandle lockfd
    ret <- action handle'
    hClose handle'
    return ret
  where
    lockType ReadMode = ReadLock
    lockType _ = WriteLock
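
One possible way to resolve the TODO, as a sketch only: put the open-and-lock dance in the acquire step of Control.Exception.bracket and use hClose as the release step. This assumes the unix package and the same handleToFd / waitToSetLock / fdToHandle calls as above. It does not cover a failure between handleToFd and fdToHandle, where the raw descriptor could still leak.

```haskell
import Control.Exception (bracket)
import System.IO
import System.Posix.IO
  ( LockRequest (ReadLock, WriteLock)
  , fdToHandle
  , handleToFd
  , waitToSetLock
  )

-- | Open a file, take an fcntl lock on the whole file, run the action,
-- and always close the handle (closing the descriptor drops the lock).
withFileLocked :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFileLocked file mode = bracket acquire hClose
  where
    acquire = do
      h <- openFile file mode
      fd <- handleToFd h -- closes h, leaves the descriptor open
      -- Start 0 with length 0 means "lock the entire file".
      waitToSetLock fd (lockType mode, AbsoluteSeek, 0, 0)
      fdToHandle fd

    lockType ReadMode = ReadLock
    lockType _ = WriteLock
```

With this shape, an exception thrown by the action still runs hClose on the locked handle, which is what the manual version cannot guarantee.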

BTW, thanks for the hint that ByteString has a strict getContents! I was
prototyping my code with String and thought I'd worry about optimisation
later, but that is a good reason to use ByteString up front.
Post by Ertugrul Soeylemez
Implementations should enforce as far as possible, at least locally to the
Haskell process, multiple-reader single-writer locking on files.
According to strace, this does not involve any system-level locking with
flock/fcntl/lockf. It is done internally to the ghc process.
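
The in-process nature of that locking can be observed with a small sketch like the following. It assumes GHC, where a second openFile for writing on a file that the same process already has open fails with a "resource busy" error; the function name secondWriterRefused is hypothetical.

```haskell
import Control.Exception (try)
import System.IO
import System.IO.Error (isAlreadyInUseError)

-- | Hold a file open for writing and try to open it for writing again
-- from the same process. Returns True when the second open is refused.
secondWriterRefused :: FilePath -> IO Bool
secondWriterRefused path =
  withFile path WriteMode $ \_ -> do
    r <- try (openFile path WriteMode)
    case r of
      Left e -> return (isAlreadyInUseError e)
      Right h -> hClose h >> return False
```

This check only involves handles inside one Haskell process; it says nothing about other processes, which is why a system-level lock is still needed for those.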
--
see shy jo
Jimmy Wylie
2010-10-11 05:30:04 UTC
Post by Joey Hess
I don't know how to build your lockFile function. It seems that it would
have a side effect of closing the handle. It could return a new handle
like this, but then withFile's automatic close of the file would be
defeated.
lockFile h = do
    lockfd <- handleToFd h -- closes h
    waitToSetLock lockfd (lockType mode, AbsoluteSeek, 0, 0)
    newh <- fdToHandle lockfd
    return newh
Here's what I'm using instead.
withFileLocked file mode action = do
    -- TODO: find a way to use bracket here
    handle <- openFile file mode
    lockfd <- handleToFd handle -- closes handle
    waitToSetLock lockfd (lockType mode, AbsoluteSeek, 0, 0)
    handle' <- fdToHandle lockfd
    ret <- action handle'
    hClose handle'
    return ret
  where
    lockType ReadMode = ReadLock
    lockType _ = WriteLock
I was looking here:
http://www.haskell.org/ghc/docs/6.12.2/html/libraries/unix-2.4.0.1/System-Posix-IO.html

Instead of creating the handle, then converting to an fd, only to return
a new handle, why don't you start with the file descriptor and convert
at the end of the process. I don't have time for a full piece of code,
but maybe something like this:

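lockFile file = do
    -- openFd takes the path first (unix-2.4 signature)
    fd <- openFd file ReadOnly Nothing defaultFileFlags
    -- ReadLock to match ReadOnly; no mode is in scope here
    waitToSetLock fd (ReadLock, AbsoluteSeek, 0, 0)
    handle <- fdToHandle fd
    return handle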

You could also use the same sort of code in your withFileLocked function.
Post by Joey Hess
According to strace, this does not involve any system-level locking with
flock/fcntl/lockf. It is done internally to the ghc process.
Thanks for testing that out. I appreciate the information.

Ciao,
Jimmy
Ertugrul Soeylemez
2010-10-11 05:39:01 UTC
Post by Joey Hess
Post by Ertugrul Soeylemez
readFileLocked :: FilePath -> IO B.ByteString
readFileLocked fn =
    withFile fn ReadMode $ \h -> do
        lockFile h -- for a suitable function lockFile
        B.hGetContents h
It is, BTW, always preferable to use withFile over openFile, if you
can. This makes your code cleaner and also exception-safe.
Unless there is a better locking primitive than waitToSetLock
available, I don't know how to build your lockFile function. It seems
that it would have a side effect of closing the handle. It could
return a new handle like this, but then withFile's automatic close of
the file would be defeated.
I wasn't addressing the locking issue, but rather the laziness issue.
The lockFile function was just a placeholder.


Greets,
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/
Jimmy Wylie
2010-10-11 05:14:54 UTC
Post by Ertugrul Soeylemez
It is, BTW, always preferable to use withFile over openFile, if you can.
This makes your code cleaner and also exception-safe.
Hi Ertugrul,

I don't think I quite understand. How is withFile exception-safe?
Under the covers it's using openFile. I was under the impression
withFile was just a nice way to remove boilerplate file operation code.
Here's what I found on hoogle: "withFile name mode act opens a file
using openFile and passes the resulting handle to the computation act.
The handle will be closed on exit from withFile, whether by normal
termination or by raising an exception."


Thanks,
Jimmy
Ertugrul Soeylemez
2010-10-11 05:35:25 UTC
Post by Jimmy Wylie
Post by Ertugrul Soeylemez
It is, BTW, always preferable to use withFile over openFile, if you can.
This makes your code cleaner and also exception-safe.
I don't think I quite understand. How is withFile exception-safe?
Under the covers it's using openFile. I was under the impression
withFile was just a nice way to remove boilerplate file operation code.
Here's what I found on hoogle: "withFile name mode act opens a file
using openFile and passes the resulting handle to the computation act.
The handle will be closed on exit from withFile, whether by normal
termination or by raising an exception."
The last statement is the point:

"[...] whether by normal termination or by raising an exception."

If you openFile and hClose manually, then you need to take care of
exceptions yourself. You need to make sure that hClose is called in all
cases. For example, what if reading the file throws an exception that
is not caught? hClose may be skipped that way. withFile ensures that
the handle is always closed, even if an exception escapes your function.
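
As a rough illustration of that guarantee, here is a sketch of how a withFile-like function can be written with Control.Exception.bracket. The name withFile' is hypothetical, and this is not claimed to be the library's actual source.

```haskell
import Control.Exception (bracket)
import System.IO

-- | Illustrative stand-in for withFile: open the file, run the action,
-- and run hClose afterwards whether the action returns or throws.
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' name mode = bracket (openFile name mode) hClose
```

bracket takes an acquire action, a release action, and the body; the release action runs even if the body raises an exception, and the exception is then rethrown to the caller.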


Greets,
Ertugrul
--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/
Brandon S Allbery KF8NH
2010-10-12 11:31:53 UTC
I don't think I quite understand. How is withFile exception-safe? Under
the covers it's using openFile. I was under the impression withFile was just
Under the covers it's also using Control.Exception.bracket, which you should
probably look at.

- --
brandon s. allbery [linux,solaris,freebsd,perl] ***@kf8nh.com
system administrator [openafs,heimdal,too many hats] ***@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH