Discussion:
Problem with Samba re-share of a CIFS mount
Gionatan Danti
2014-02-11 09:30:13 UTC
Permalink
Hi all,
I have a strange problem trying to re-share, via Samba, a CIFS mount.

Let me first explain my network topology:

Win2008R2 w/SMB share -> low speed link -> Linux Box -> WIN7 clients

In short, a Win2008 R2 share is being accessed by some Win7 clients over
a slow (~8 Mb/s downstream, ~1 Mb/s upstream) ADSL link. To speed up
read operations at the branch office, I thought of using a Linux box
with cachefilesd and a CONFIG_CIFS_FSCACHE enabled kernel. The Linux box
mounts the Win2008R2 share and, thanks to cachefilesd, it maintains a
"hot cache" of the read data. This CIFS mount is then re-shared, via
Samba, to the other Win7 client PCs.
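
For reference, the setup on the Linux box looks roughly like this
(server name, share and paths are placeholders for my real ones):

  # CIFS mount with fscache enabled; cachefilesd must be running,
  # with a cache directory configured in /etc/cachefilesd.conf
  mount -t cifs //win2008r2/share /mnt/branch -o username=svcuser,fsc

and a minimal smb.conf stanza re-exporting the mount:

  [branch]
      path = /mnt/branch
      read only = no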

However, the problem is that when re-sharing the CIFS mount, the Win7
clients often see the directories inside the mount as regular files,
not directories! In other words, if I have a directory "test" inside
the mount, the client PC will see a _file_ called "test". When
double-clicking on that "file", the Win7 client even asks me to select
an application to open it.

The strange thing is that this problem happens with some Linux kernel
versions, but not with others. These are my results:

1) Stock CentOS 6.5 x86-64 system (kernel 2.6.32-431.1.2.0.1, cifs-utils
4.8.1-19, samba 3.6.9-167): no problem here, but this kernel does not
have CONFIG_CIFS_FSCACHE, so I cannot use it to speed up read access;

2) CentOS 6.5 x86-64 with ElRepo updates (kernel 3.10.28-1): here
CONFIG_CIFS_FSCACHE is enabled, but I have the problem described above;

3) Debian 7 amd64 with latest updates (kernel 3.2.54-2, cifs-utils
2:5.5-1): CONFIG_CIFS_FSCACHE is enabled, problem happens;

4) Fedora 20 x86-64 (kernel 3.12.8-300, cifs-utils 6.3-1, samba
4.1.3-2): CONFIG_CIFS_FSCACHE is enabled and problem does _not_ happen,
however this is a client distro and I am not so comfortable putting it
into production.

Does anyone have an explanation of what is happening here?
Regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Jeff Layton
2014-02-11 15:33:02 UTC
Permalink
On Tue, 11 Feb 2014 10:30:13 +0100
Post by Gionatan Danti
However, the problem is that when re-sharing the CIFS mount, the Win7
clients often see the directories inside the mount as regular files,
not directories!
[...]
Does anyone have an explanation of what is happening here?
Most likely, the problem is that the cifs mount is returning an
st_nlink value of 0 for directories, and that confuses samba into
thinking that directories are files (I forget their rationale for this).

More recent kernels have patches that make the client fake up st_nlink
values when the server sends 0 for a NumberOfLinks value.
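
You can check for this from the Linux box with something like the
following (illustrative; /mnt/branch stands in for wherever the cifs
share is mounted):

  # print hard link count, file type and name
  stat -c '%h %F %n' /mnt/branch/test

On an affected kernel, the hard link count printed for a directory will
be 0.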
--
Jeff Layton <jlayton-***@public.gmane.org>
Gionatan Danti
2014-02-11 15:50:45 UTC
Permalink
Hi Jeff,
I had the same idea.

When mounting the CIFS directory, the problematic installations return 0
links for both dirs and files. On the other hand, the stock CentOS
installation returns 1 or more links.

This puzzles me. Two questions:
- does anyone know the rationale behind this?
- how is it possible to work around that with an unpatched kernel?

Thank you and regards.
Post by Jeff Layton
Most likely, the problem is that the cifs mount is returning an
st_nlink value of 0 for directories, and that confuses samba into
thinking that directories are files (I forget their rationale for this).
More recent kernels have patches that make the client fake up st_nlink
values when the server sends 0 for a NumberOfLinks value.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Steve French
2014-02-11 16:59:35 UTC
Permalink
We did make changes in this area - what are the kernel versions of the two?
Post by Gionatan Danti
When mounting the CIFS directory, the problematic installations return 0
links for both dirs and files. On the other hand, the stock CentOS
installation returns 1 or more links.
--
Thanks,

Steve
Gionatan Danti
2014-02-11 17:05:49 UTC
Permalink
Hi,
these are my tests and results, complete with kernel and package versions:

1) Stock CentOS 6.5 x86-64 system (kernel 2.6.32-431.1.2.0.1, cifs-utils
4.8.1-19, samba 3.6.9-167): no problem here, but this kernel does not
have CONFIG_CIFS_FSCACHE, so I cannot use it to speed up read access;

2) CentOS 6.5 x86-64 with ElRepo updates (kernel 3.10.28-1): here
CONFIG_CIFS_FSCACHE is enabled, but I have the problem described above;

3) Debian 7 amd64 with latest updates (kernel 3.2.54-2, cifs-utils
2:5.5-1): CONFIG_CIFS_FSCACHE is enabled, problem happens;

4) Fedora 20 x86-64 (kernel 3.12.8-300, cifs-utils 6.3-1, samba
4.1.3-2): CONFIG_CIFS_FSCACHE is enabled and problem does _not_ happen,
however this is a client distro and I am not so comfortable putting it
into production.

In all cases, the share was published by a Win2008R2 server.

Continuing my search, I found this:
https://lists.samba.org/archive/samba-technical/2013-August/094532.html

I can confirm that by forcing the use of CIFS ACLs (using the cifsacl
mount option) the problem disappears even on the problematic setups. An
ls -al on the mounted folder shows 1 or more links. However, I am not
sure if this is a good workaround.
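
For reference, the working mount is something like this (names are
placeholders for my real ones):

  # cifsacl forces fetching the security descriptor; in my tests the
  # link count then comes back as 1 or more
  mount -t cifs //win2008r2/share /mnt/branch -o username=svcuser,fsc,cifsacl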

Let me know your opinions.
Regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Jeff Layton
2014-02-11 17:45:36 UTC
Permalink
On Tue, 11 Feb 2014 16:50:45 +0100
Post by Gionatan Danti
- does anyone know the rationale behind this?
The rationale is that windows servers always send a NumberOfLinks value
of '0' for directories. We have a hack in place that went in around a
year ago to work around that for (arguably broken) applications that
try to infer something about an inode that has a zero st_nlink value.
Post by Gionatan Danti
- how is it possible to work around that with an unpatched kernel?
There is no workaround. Either fix the application such that it doesn't
care or patch the kernel. I'll cc Jim since he did a fair bit of
looking at this several months ago.

In truth though, resharing a cifs mount is probably not a great
solution. It sounds like the kind of setup that's going to end up being
fraught with cache coherency problems...
--
Jeff Layton <jlayton-***@public.gmane.org>
Steve French
2014-02-11 18:01:50 UTC
Permalink
Post by Jeff Layton
In truth though, resharing a cifs mount is probably not a great
solution. It sounds like the kind of setup that's going to end up being
fraught with cache coherency problems...
The problem is that there are situations where it is required (usually
due to legacy dialect support or legacy authentication support).

I am not as worried about the cache coherence issues if we are mounting
with "cache=none", and we can even set actimeo lower if needed.
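
i.e. something along the lines of (server and path are placeholders):

  # no client-side data caching; attributes revalidated after 1 second
  mount -t cifs //server/share /mnt/reshare -o cache=none,actimeo=1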
--
Thanks,

Steve
Jeff Layton
2014-02-13 11:37:38 UTC
Permalink
On Tue, 11 Feb 2014 12:01:50 -0600
Post by Steve French
The problem is that there are situations where it is required (usually
due to legacy dialect support or legacy authentication support).
I am not as worried about the cache coherence issues if we are mounting
with "cache=none", and we can even set actimeo lower if needed.
Using cache=none sort of defeats the purpose. After all, Gionatan said
that he was doing this specifically to use fscache, and that won't work
with cache=none.

But let's leave that aside for a moment and consider whether this could
work at all. Assume we have samba set up to re-share a cifs mount:

Client sends an open to samba and requests an oplock. Samba then opens
a file on the cifs mount, and does not request an oplock (because of
cache=none). We then attempt to set a lease, which will fail because we
don't have an oplock. Now you're no better off (and probably worse off)
since you have zero caching going on and are having to bounce each
request through an extra hop.

So, suppose you disable "kernel oplocks" in samba in order to get samba
to hand out L2 oplocks in this situation. Another client then comes
along on the main (primary) server and changes a file. Samba is then
not aware of that change and hilarity (aka data corruption) ensues.

I just don't see how re-sharing a cifs mount is a good idea, unless you
are absolutely certain that the data you're resharing won't ever
change. If that's the case, then you're almost certainly better off
keeping a local copy on the samba server and sharing that out.
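
For instance (just a sketch, and the paths are hypothetical), you could
periodically pull a copy from the master and let samba share the local
directory instead:

  # run from cron; /mnt/master is the cifs mount of the primary server,
  # /srv/reshare is the directory exported by the local samba
  rsync -a --delete /mnt/master/ /srv/reshare/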
--
Jeff Layton <jlayton-***@public.gmane.org>
Gionatan Danti
2014-02-13 17:29:45 UTC
Permalink
Post by Jeff Layton
Using cache=none sort of defeats the purpose. After all, Gionatan said
that he was doing this specifically to use fscache, and that won't work
with cache=none.
Surely my idea was to use FSCACHE to speed up remote access. Without it,
the entire discussion is pointless...
Post by Jeff Layton
So, suppose you disable "kernel oplocks" in samba in order to get samba
to hand out L2 oplocks in this situation. Another client then comes
along on the main (primary) server and changes a file. Samba is then
not aware of that change and hilarity (aka data corruption) ensues.
Would you give the same advice for low-frequency file changes (e.g.
office files)?

What about using NFS to export the Fileserver directory, mount it (via
mount.nfs) on the remote Linux box and then sharing it via Samba? Is it
a horrible Frankenstein?
Post by Jeff Layton
I just don't see how re-sharing a cifs mount is a good idea, unless you
are absolutely certain that the data you're resharing won't ever
change. If that's the case, then you're almost certainly better off
keeping a local copy on the samba server and sharing that out.
After many tests, I tend to agree. Using a Fedora 20 test machine with
fscache+cachefilesd as the remote Linux box, I had one kernel panic and
multiple failed file copies (with Windows complaining about a "bad
signature").

I also found this: https://bugzilla.redhat.com/show_bug.cgi?id=646224
Maybe CIFS FSCACHE is not really production-grade even on the latest
distros?

Thank you and regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Steve French
2014-02-13 18:04:32 UTC
Permalink
Post by Gionatan Danti
After many tests, I tend to agree. Using a Fedora 20 test machine with
fscache+cachefilesd as the remote Linux box, I had one kernel panic and
multiple failed file copies (with Windows complaining about a "bad
signature").
I also found this: https://bugzilla.redhat.com/show_bug.cgi?id=646224
Maybe CIFS FSCACHE is not really production-grade even on the latest
distros?
I have not found fscache to be a problem in my tests, but I did find
problems with Samba 4.1 re-exporting cifs directories.

I am investigating this, so any log information that you have, or any
additional problem-determination details, would be appreciated.
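
(For example, cifs debug logging can be turned up on the client with:

  echo 7 > /proc/fs/cifs/cifsFYI    # echo 1 on older kernels

and any oops/panic traces from dmesg or the console would help too.)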
--
Thanks,

Steve
Gionatan Danti
2014-02-14 10:27:10 UTC
Permalink
Post by Steve French
I have not found fscache to be a problem in my tests, but did find
problems with Samba 4.1 reexporting cifs directories.
I am investigating this so any log information that you have or
additional problem determination details would be appreciated.
Hi,
I am having some difficulty replicating the kernel panic; it happened
only once. I will certainly dig through the logs, but do you have any
suggestions on what to search for?

Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Jeff Layton
2014-02-13 19:40:38 UTC
Permalink
On Thu, 13 Feb 2014 18:29:45 +0100
Post by Gionatan Danti
What about using NFS to export the Fileserver directory, mount it (via
mount.nfs) on the remote Linux box and then sharing it via Samba? Is it
a horrible Frankenstein?
You'll have similar problems with NFS.

You can't acquire leases on NFS either, so with kernel oplocks enabled
on samba you won't ever get oplocks on there. If you turn them off (so
that oplocks are tracked internally) you won't be aware of changes that
occur outside of samba.
Post by Gionatan Danti
I also found this: https://bugzilla.redhat.com/show_bug.cgi?id=646224
Maybe CIFS FSCACHE is not really production-grade even on the latest
distros?
I don't recall whether Suresh ever fixed those bugs. cifs+fsc is
certainly not widely used, and it wouldn't surprise me if it were still
horribly buggy.

fscache is somewhat at odds with the fundamental caching model of the
cifs protocol. The whole point of fscache is to speed up access to
frequently read files when a client starts up, and to reduce load on the
server in these cases.

For NFS, that works because we rely on looking at inode attributes to
determine whether the file has changed (i.e. the mtime, size, NFSv4
change attribute). So, with NFS we can reasonably tell whether a file
has changed across a client remount.

For CIFS, things are different. The protocol basically states that you
should only cache file data if you hold an oplock, and you only get an
oplock when you open a file. When you first bring up a client, you
don't hold one, so you really should just toss out any data that you're
caching...thereby making fscache sort of pointless.

Now, there is some argument that you can use fsc and still follow the
protocol by using it as "swap for pagecache". IOW, you could use it
to cache a larger amount of open file data than you have memory. I'm not
aware of anyone having actually tested whether that works, however.
--
Jeff Layton <jlayton-***@public.gmane.org>
Suresh Jayaraman
2014-02-14 02:14:56 UTC
Permalink
Post by Jeff Layton
I don't recall whether Suresh ever fixed those bugs. cifs+fsc is
If you are referring to this oops:
http://thread.gmane.org/gmane.linux.file-systems.cachefs.general/2961
it was fixed by the commit below:

commit c902ce1bfb40d8b049bd2319b388b4b68b04bc27
Author: David Howells <dhowells-H+wXaHxf7aLQT0dZR+***@public.gmane.org>
Date: Thu Jul 7 12:19:48 2011 +0100

FS-Cache: Add a helper to bulk uncache pages on an inode

I remember verifying it at the time by running fsstress for many hours. I'm not sure what other bugs you are referring to.
Post by Jeff Layton
certainly not widely used, and it wouldn't surprise me if it were still
horribly buggy.
Just curious, why would you say so?



- Suresh Jayaraman
Jeff Layton
2014-02-14 12:06:49 UTC
Permalink
On Fri, 14 Feb 2014 02:14:56 +0000
Post by Suresh Jayaraman
If you are referring to this oops:
http://thread.gmane.org/gmane.linux.file-systems.cachefs.general/2961
it was fixed by the commit below:
commit c902ce1bfb40d8b049bd2319b388b4b68b04bc27
FS-Cache: Add a helper to bulk uncache pages on an inode
I remember verifying it at the time by running fsstress for many hours.
I'm not sure what other bugs you are referring to.
Ahh thanks. I don't think we ever turned on CONFIG_CIFS_FSCACHE in
rhel6, so I'm not sure what sort of problem Gionatan was hitting.
Post by Suresh Jayaraman
Post by Jeff Layton
certainly not widely used, and it wouldn't surprise me if it were still
horribly buggy.
Just curious, why would you say so?
I haven't heard of many people using it, and features that don't get
widely used don't tend to be widely tested. Not a reflection on your
work, but a statement that it is more of a niche feature that hasn't
been widely deployed.

I certainly could be wrong on that point however. I haven't played with
it in quite some time.
--
Jeff Layton <jlayton-***@public.gmane.org>
Gionatan Danti
2014-02-14 10:25:17 UTC
Permalink
Post by Jeff Layton
You'll have similar problems with NFS.
You can't acquire leases on NFS either, so with kernel oplocks enabled
on samba you won't ever get oplocks on there. If you turn them off (so
that oplocks are tracked internally) you won't be aware of changes that
occur outside of samba.
Ok, so it is the SMB protocol that is intrinsically unfriendly to
persistent caches. Maybe it is for this very reason that Microsoft
suggests using the "offline files" feature only where files change on
a single side (e.g. a laptop).
Post by Jeff Layton
For CIFS, things are different. The protocol basically states that you
should only cache file data if you hold an oplock, and you only get an
oplock when you open a file. When you first bring up a client, you
don't hold one, so you really should just toss out any data that you're
caching...thereby making fscache sort of pointless.
What about not holding an oplock but watching file attributes (e.g. the
last-modified date)? I think cache=loose does this same thing. The man
page says that Windows can be "lazy" about updating file attributes,
but I think we are speaking of a few seconds at most. In scenarios with
a low probability of cross-editing, this few-second window seems
reasonably small. Am I missing something?
Post by Jeff Layton
Now, there is some argument that you can use fsc and still follow the
protocol by using it as "swap for pagecache". IOW, you could use it
to cache a larger amount of open file data than you have memory. I'm not
aware of anyone having actually tested whether that works, however.
Mmm... this "swap for pagecache" will not survive a reboot, right? Or,
better stated, the cache _will_ survive, but, not holding any oplock,
the client will request the live data from the server, right?

Thank you.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Jeff Layton
2014-02-14 12:17:24 UTC
Permalink
On Fri, 14 Feb 2014 11:25:17 +0100
Post by Gionatan Danti
What about not holding an oplock but watching file attributes (e.g. the
last-modified date)? I think cache=loose does this same thing. The man
page says that Windows can be "lazy" about updating file attributes,
but I think we are speaking of a few seconds at most. In scenarios with
a low probability of cross-editing, this few-second window seems
reasonably small. Am I missing something?
That's basically what cache=loose does with cifs. One might consider
that to be an NFS-like caching model. That used to be the default
behavior, but we changed it a few years ago since strict adherence
to the protocol is really the only way to ensure that you don't end up
with data corruption.

The main problem with that is that Windows servers do lazy updates to
their LastWriteTime (aka mtime), so watching for mtime changes is not a
reliable method for detecting when a file has changed.
Post by Gionatan Danti
Mmm... this "swap for pagecache" will not survive a reboot, right? Or,
better stated, the cache _will_ survive, but, not holding any oplock,
the client will request the live data from the server, right?
That's correct. If, however, you mount with cache=loose, then fsc should
persist across reboots as long as the files don't appear to have
changed. That has its own problems, however, particularly if you're
dealing with Windows servers (see the comment above about lazy updates
to LastWriteTime).
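
IOW, something like this (paths are hypothetical), with all the caveats
above about lazy LastWriteTime updates:

  mount -t cifs //server/share /mnt/branch -o cache=loose,fsc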

One thing you could consider is looking into BranchCache if you are
using a relatively recent Windows infrastructure:

http://technet.microsoft.com/en-us/network/dd425028.aspx

Chris Hertel had also started a project to implement something similar
on unix-y OSes as well, but I'm not sure of the current state of that
work.
--
Jeff Layton <jlayton-***@public.gmane.org>
Gionatan Danti
2014-02-14 14:10:16 UTC
Permalink
Post by Jeff Layton
That's basically what cache=loose does with cifs. One might consider
that to be an NFS-like caching model. That used to be the default
behavior, but we changed it a few years ago since strict adherence
to the protocol is really the only way to ensure that you don't end up
with data corruption.
The main problem with that is that Windows servers do lazy updates to
their LastWriteTime (aka mtime), so watching for mtime changes is not a
reliable method for detecting when a file has changed.
Ok
Post by Jeff Layton
That's correct. If, however, you mount with cache=loose, then fsc should
persist across reboots as long as the files don't appear to have
changed. That has its own problems, however, particularly if you're
dealing with Windows servers (see the comment above about lazy updates
to LastWriteTime).
This also matches what I observed in my tests :)
Post by Jeff Layton
One thing you could consider is looking into BranchCache if you are
http://technet.microsoft.com/en-us/network/dd425028.aspx
Chris Hertel had also started a project to implement something similar
on unix-y OS' as well, but I'm not sure of the current state of that
work.
True, but I can't use Windows 2008R2/2012 on the remote box (this is a
requirement), so BranchCache is not an option here. After all, in a
Windows-server-to-Windows-server scenario, DFSR would be a very
compelling solution (even better than BranchCache, in my opinion).

Let me thank you all for the time dedicated to me and to the Samba
project!
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Jeff Layton
2014-02-14 12:08:46 UTC
Permalink
On Thu, 13 Feb 2014 18:29:45 +0100
Post by Gionatan Danti
After many tests, I tend to agree. Using a Fedora 20 test machine with
fscache+cachefilesd as the remote Linux box, I had one kernel panic and
multiple failed file copies (with Windows complaining about a "bad
signature").
I also found this: https://bugzilla.redhat.com/show_bug.cgi?id=646224
BTW, if you're seeing panics or other problems then please do report
them. As Suresh points out, the bug in that RHBZ should now be fixed.
If you're still seeing a panic in that code then we do want to fix that.
--
Jeff Layton <jlayton-***@public.gmane.org>
Gionatan Danti
2014-02-14 14:05:32 UTC
Permalink
Post by Jeff Layton
BTW, if you're seeing panics or other problems then please do report
them. As Suresh points out, the bug in that RHBZ should now be fixed.
If you're still seeing a panic in that code then we do want to fix that.
I'll try to replicate the panic and, if I succeed, I will report back.

Thank you.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8
Gionatan Danti
2014-02-11 18:09:47 UTC
Permalink
Post by Jeff Layton
The rationale is that windows servers always send a NumberOfLinks value
of '0' for directories. We have a hack in place that went in around a
year ago to work around that for (arguably broken) applications that
try to infer something about an inode that has a zero st_nlink value.
There is no workaround. Either fix the application such that it doesn't
care or patch the kernel.
Ok, I understand now :)

Thank you very much, Jeff.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti-N44kj/***@public.gmane.org - info-N44kj/***@public.gmane.org
GPG public key ID: FF5F32A8