Discussion:
blocking write() after disconnecting cifs server
m***@public.gmane.org
2013-12-03 13:16:08 UTC
Hello,

write() from the cifs kernel driver blocks when disconnecting the cifs
server. The blocking call didn't return after 30 minutes. Client and
server are connected via a switch and the server's LAN cable is
unplugged during the write call. I use kernel 3.11.8 and mounted
without the "hard" option.

Is there a possibility for a non-blocking write() without using O_SYNC
or the "directio" mount option?

Way to reproduce the scenario: Below is a sample program which calls
write() in a loop. The error messages appear when unplugging the cable
during this loop.

Kind regards,
Hagen

CIFS VFS: sends on sock ffff88003710c280 stuck for 15 seconds
CIFS VFS: Error -11 sending data on socket to server

#include <fstream>
#include <iostream>

int main () {
  const int size = 100000;
  char buffer[size] = {};  // contents are irrelevant; zero-initialised
  std::ofstream outfile("/mnt/new.bin", std::ofstream::binary);
  if (!outfile.is_open())
  {
    return 1;
  }
  for (int idx = 0; idx < 10000 && outfile.good(); idx++)
  {
    outfile.write(buffer, size);
    std::cout << "written, size=" << size << std::endl;
  }
  std::cout << "finished " << outfile.good() << std::endl;
  outfile.close();
  return 0;
}
Jeff Layton
2013-12-23 10:46:43 UTC
On Tue, 3 Dec 2013 14:16:08 +0100 (CET)
mail654-***@public.gmane.org wrote:

> Hello,
>
> write() from the cifs kernel driver blocks when disconnecting the cifs
> server. The blocking call didn't return after 30 minutes. Client and
> server are connected via a switch and the server's LAN cable is
> unplugged during the write call. I use kernel 3.11.8 and mounted
> without the "hard" option.
>
> Is there a possibility for a non-blocking write() without using O_SYNC
> or the "directio" mount option?
>
> Way to reproduce the scenario: Below is a sample program which calls
> write() in a loop. The error messages appear when unplugging the cable
> during this loop.
>
> Kind regards,
> Hagen
>
> CIFS VFS: sends on sock ffff88003710c280 stuck for 15 seconds
> CIFS VFS: Error -11 sending data on socket to server
>
> [sample program snipped]

A hang of that length is unexpected. If you're able to reproduce this,
can you get the stack from the task issuing the write at the time?

$ cat /proc/<pid>/stack

That might give us a clue as to what it's doing.
--
Jeff Layton <jlayton-***@public.gmane.org>
Jeff Layton
2014-01-02 19:31:57 UTC
On Thu, 2 Jan 2014 17:04:27 +0100 (CET)
mail654-***@public.gmane.org wrote:

> > > write() from the cifs kernel driver blocks when disconnecting the
> > > cifs server. The blocking call didn't return after 30 minutes.
> > > [rest of the report and sample program snipped]
> >
> > A hang of that length is unexpected. If you're able to reproduce this,
> > can you get the stack from the task issuing the write at the time?
> >
> > $ cat /proc/<pid>/stack
> >
> > That might give us a clue as to what it's doing.
>
> [<ffffffff8170ab8c>] balance_dirty_pages.isra.19+0x4ac/0x55c
> [<ffffffff8115455b>] balance_dirty_pages_ratelimited+0xeb/0x110
> [<ffffffff81148f3a>] generic_perform_write+0x16a/0x210
> [<ffffffff8114903d>] generic_file_buffered_write+0x5d/0x90
> [<ffffffff8114aa66>] __generic_file_aio_write+0x1b6/0x3b0
> [<ffffffff8114acc9>] generic_file_aio_write+0x69/0xd0
> [<ffffffffa03ef225>] cifs_strict_writev+0xa5/0xd0 [cifs]
> [<ffffffff811b2b95>] do_sync_readv_writev+0x65/0x90
> [<ffffffff811b4312>] do_readv_writev+0xd2/0x2b0
> [<ffffffff811b452c>] vfs_writev+0x3c/0x50
> [<ffffffff811b46a2>] SyS_writev+0x52/0xc0
> [<ffffffff8172976f>] tracesys+0xe1/0xe6
> [<ffffffffffffffff>] 0xffffffffffffffff
>=20

Looks like it's stuck in dirty page throttling.

What's likely happening is that you have a bunch of dirty pages when
you go to pull the cable. At that point the system is trying to flush
the pages so that this task can try to dirty more of them.

What *should* happen (at least if this is a soft mount) is that the
writeback of those pages eventually times out, the pages get their
error bit set and eventually the write() syscalls go through.

Have you tried stracing this, and can you tell whether the write
syscall ever returns in this situation? Is it possible that the
write() syscalls are returning, albeit slowly?

--
Jeff Layton <jlayton-***@public.gmane.org>
m***@public.gmane.org
2014-01-03 09:10:32 UTC
jlayton-***@public.gmane.org wrote:
> On Thu, 2 Jan 2014 17:04:27 +0100 (CET)
> mail654-***@public.gmane.org wrote:
>
> > [original report, sample program, and stack trace snipped]
>
> Looks like it's stuck in dirty page throttling.
>
> What's likely happening is that you have a bunch of dirty pages when
> you go to pull the cable. At that point the system is trying to flush
> the pages so that this task can try to dirty more of them.
>
> What *should* happen (at least if this is a soft mount) is that the
> writeback of those pages eventually times out, the pages get their
> error bit set and eventually the write() syscalls go through.
>
> Have you tried stracing this, and can you tell whether the write
> syscall ever returns in this situation? Is it possible that the
> write() syscalls are returning, albeit slowly?

No, during several straces I've never seen a write() syscall return
after pulling the cable.

Hagen
Shirish Pargaonkar
2014-01-11 02:45:41 UTC
I think there are two things going on.

The flusher thread for the backing device is blocked trying to
reconnect. So no dirty pages are getting flushed out.

Meanwhile, the application thread just keeps writing to the page cache.

So both threads are oblivious to each other.

Shouldn't cifs somehow detect that an app is dirtying too many pages,
error out its writes, and set the error bit on all of that app's dirty
pages?




On Thu, Jan 2, 2014 at 1:31 PM, Jeff Layton <jlayton-***@public.gmane.org> wrote:
> [earlier thread quoted in full; snipped]
>
> Looks like it's stuck in dirty page throttling.
>
> What *should* happen (at least if this is a soft mount) is that the
> writeback of those pages eventually times out, the pages get their
> error bit set and eventually the write() syscalls go through.
>
> --
> Jeff Layton <jlayton-***@public.gmane.org>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo-***@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Shirish Pargaonkar
2014-01-19 04:40:15 UTC
If an address_space has the AS_EIO flag set, should subsequent writes
start failing instead of writing to the page cache?
Also, I'm not sure what happens to pages with the PG_error bit set;
they probably get discarded.

On Thu, Jan 2, 2014 at 1:31 PM, Jeff Layton <jlayton-***@public.gmane.org> wrote:
> [earlier thread quoted in full; snipped]
Jeff Layton
2014-01-20 12:40:43 UTC
On Sat, 18 Jan 2014 22:40:15 -0600
Shirish Pargaonkar <shirishpargaonkar-***@public.gmane.org> wrote:

> If an address_space has the AS_EIO flag set, should subsequent writes
> start failing instead of writing to the page cache?
> Also, I'm not sure what happens to pages with the PG_error bit set;
> they probably get discarded.
>

Yeah, possibly.

What the NFS client does is switch to synchronous writes whenever there
is an issue. See nfs_need_sync_write(). Perhaps cifs should do
something similar?


> On Thu, Jan 2, 2014 at 1:31 PM, Jeff Layton <jlayton-***@public.gmane.org> wrote:
> > [earlier thread quoted in full; snipped]


--
Jeff Layton <jlayton-***@public.gmane.org>