[S390] fix tlb flushing vs. concurrent /proc accesses

The tlb flushing code uses the mm_users field of the mm_struct to
decide if each page table entry needs to be flushed individually with
IPTE or if a global flush for the mm_struct is sufficient after all page
table updates have been done. The comment for mm_users says "How many
users with user space?" but the /proc code increases mm_users after it
found the process structure by pid without creating a new user process.
Which makes mm_users useless for the decision between the two tlb
flusing methods. The current code can be confused to not flush tlb
entries by a concurrent access to /proc files if e.g. a fork is in
progres. The solution for this problem is to make the tlb flushing
logic independent from the mm_users field.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 976e273..a6f0e7c 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -11,11 +11,14 @@
 
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
+#include <asm/tlbflush.h>
 #include <asm-generic/mm_hooks.h>
 
 static inline int init_new_context(struct task_struct *tsk,
 				   struct mm_struct *mm)
 {
+	atomic_set(&mm->context.attach_count, 0);
+	mm->context.flush_mm = 0;
 	mm->context.asce_bits = _ASCE_TABLE_LENGTH | _ASCE_USER_BITS;
 #ifdef CONFIG_64BIT
 	mm->context.asce_bits |= _ASCE_TYPE_REGION3;
@@ -76,6 +79,12 @@
 {
 	cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
 	update_mm(next, tsk);
+	atomic_dec(&prev->context.attach_count);
+	WARN_ON(atomic_read(&prev->context.attach_count) < 0);
+	atomic_inc(&next->context.attach_count);
+	/* Check for TLBs not flushed yet */
+	if (next->context.flush_mm)
+		__tlb_flush_mm(next);
 }
 
 #define enter_lazy_tlb(mm,tsk)	do { } while (0)