Skip to content

basis_checker_columns() from pg/macros/MatrixCheckers.pl is not as dependable as would be desired #409

@taniwallach

Description

@taniwallach

I have been coding some Linear Algebra problems, and started to make use of basis_checker_columns() from pg/macros/MatrixCheckers.pl. I have found that it is not as dependable as would be desired. It also gives minimal feedback and does not give partial credit.

In order to overcome these, I found that I had need for a reasonably dependable rank() function for MathObject matrices.

There is a rudimentary rank.pl at https://github.com/openwebwork/webwork-open-problem-library/blob/master/OpenProblemLibrary/NAU/setLinearAlgebra/rank.pl which seems to work reasonably well for matrices with integer or fractions as the entries. It was not intended to handle uglier numbers well. There is also the `$M->order_LR' function, but as explained by @dpvc in a forum post  that misbehaves due to doing exact comparisons to 0 when floating point linear algebra algorithms need to have some tolerance to them. (Matlab's rank function claims to be an approximation to the rank for similar reasons, and takes a tolerance as an optional second argument.)

So I coded a new rank.pl based on @dvpc's suggestions, and with additional changes so it can handle non-square matrices. As it used fuzzy comparisons it is not 100% accurate, but seems pretty reasonable in my testing so far.

Using the rank() function, I could redesign basis_checker_columns() to be more reliable (in my testing).

Below are

  • rank.pl,
  • CustomBasisChecker1.pl
    • (with both the original basis_checker_columns() and my proposed basis_checker_columns_tani() in one file, so they can be easily compared by just changing the value of checker), and
  • a test problem with some discussion of what problems I encountered with the original basis_checker_columns() and some of the limitations of the proposed alternative: test_CustomBasisChecker1.pg.

Some additional testing of the rank function from rank.pl would be helpful, as well as a review of basis_checker_columns_tani(). The checker can either replace the old

If and when the proposed alternative is considered good enough, similar changes can be made to the other checkers in pg/macros/MatrixCheckers.pl and provided either as replacements or alternatives to the existing ones.


rank.pl follows:

# Coded by Nathan Wallach, May 2019
# based on Davide Cervone's recommendation from
# http://webwork.maa.org/moodle/mod/forum/discuss.php?d=3194
# as the current $Matrix->order_LR does not use fuzzy comparisons
# so does not give good results.

sub rank { # It assumes that a MathObject matrix object is sent
  my $MM1 = shift;

  if ( $MM1->class ne "Matrix") {
    return -1; # Not a matrix
  }

  # For the case it is square
  my $MM = $MM1;

  my ($Rrows,$Rcols) = $MM1->dimensions;

  if ( ( $Rrows <= 0 ) || ( $Rcols <= 0 ) ) {
    return -1; # Not a matrix
  }

  if ( $Rrows < $Rcols ) {
    # pad to make it square
    my @rows = ();
    my $i = 1;
    for ( $i = 1 ; $i <= $Rrows ; $i++ ) {
      push( @rows, $MM1->row($i) );
    }
    while ( $i <= $Rrows ) {
      # pad with zero rows
      push( @rows, ( 0 * $MM1->row(1) ) );
      $i++;
    }
    $MM = Matrix( @rows );
  } elsif ( $Rrows > $Rcols ) {
    return( rank( $MM1->transpose ) );
  }

  # Davide's approach from http://webwork.maa.org/moodle/mod/forum/discuss.php?d=3194
  my $tempR = $MM->R;
  ($Rrows,$Rcols) = $tempR->dimensions;
  my $rank;

  for ( $rank = $Rrows ; $rank >= 1; $rank-- ) {
        last if ( $tempR->element($rank,$rank) != Real("0") );
  }
  return( $rank );
}

1;

CustomBasisChecker1.pl follows:

# Proposed redesign of basis_checker_columns() from pg/macros/MatrixCheckers.pl
# to overcome problems with the behavior of that function.

# The original code of pg/macros/MatrixCheckers.pl is 
# by Paul Pearson, Hope College, Department of Mathematics
# and was the original basis and the inspiration for much
# of that is in the proposed new checker.

# For not the proposed new checker is called basis_checker_columns_tani()
# and the local version of basis_checker_columns() has a bit of debugging
# code added.\

sub _MatrixCheckers_init {}; # don't reload this file

loadMacros("MathObjects.pl");

# ============================================================

sub concatenate_columns_into_matrix {

  my @c = @_;
  my @temp = ();
  for my $column (@c) {
    push(@temp,Matrix($column)->transpose->row(1));
  }
  return Matrix(\@temp)->transpose;

}

# The original code more directly based on Paul Pearson's original code 
# was having some difficulty handling certain incorrect answers.

# Space where I had problems had correct basis as (1,0,3,4) and (0,1,-1,-1)
# The answer (1,2,1,2) and (sqrt(253),2sqrt(253),sqrt(253),2sqrt(253)+t).
#    The second of these vectors has a shift "t" in the last coordinate
#    from being sqrt(253) times the first vector.
# For t=0 and t=0.00000001 the original code saw the vectors as dependent.
# For t=0.1,0.01 the original code marked the answer as incorrect (it could 
#    tell that the second vector was not in the space.)
# However, for t in { 0.001, 0.0001, 0.00001, 0.000001, 0.0000001 }
#    the answer was being accepted as correct when it is NOT.
# The issues seem to relate to be a result of too much "tolerance" in some
#    calculations.

# The new version does not accept the incorrect answers accepted by the prior
# version of the code. For t in { 0.1, 0.01, ..., 0.0000000001 } the second
# vector is recognized as not in the space. However, for a very small values of
# t, such as t=0.00000000001 the new code sees the 2 vectors in the answer as
# dependent (as if t were really 0).

sub basis_checker_columns_tani {

      my ( $correct, $student, $self, $answerHash ) = @_;
      my @c = @{$correct};
      my @s = @{$student};

      my $dimSpace = scalar( @c );
      my $numStudV = scalar( @s );

      # Most of the answer checking is done on integers 
      # or on decimals like 0.24381729, so we will set the
      # tolerance accordingly in a local context.

      # the tolerance was set to be much smaller than in the old code

      my $context = Context()->copy;
      $context->flags->set(
        tolerance => 0.0000001,
        tolType => "absolute",
      );

      return 0 if ( $numStudV < $dimSpace ); # count the number of vector inputs

      my $C = concatenate_columns_into_matrix(@c);
      my $S = concatenate_columns_into_matrix(@s);

      # Put $C and $S into the local context so that
      # all of the computations that follow will also be in
      # the local context.
      $C = Matrix($context,$C);
      $S = Matrix($context,$S);

      # $self->{ans}[0]->{ans_message} .= "C = $C$BR";
      # $self->{ans}[0]->{ans_message} .= "S = $S$BR";

      $rankC = rank($C);
      $rankS = rank($S);

      #  Check that the professor's vectors are, in fact, linearly independent.
      # The original approach based on
      #     Theorem: A^T A is invertible if and only if A has linearly independent columns.
      # was more likely to ignore small shifts, as the determinant would end up very small
      # We now use the improved rank.pl to test this.

      warn "Correct answer is a linearly dependent set." if ( $rankC < $dimSpace );

      my @notInSpaceMessage = (  );
      my @wasInSpace = ();
      my $notInSpace = 0;

      # Check each student vector to see if it is in the required space.
      my $j;
      for( $j = 0 ; $j < $numStudV ; $j++ ) {
        my @c1 = ( @{$correct} ); 
        push( @c1, $s[$j] );

        my $C1 = concatenate_columns_into_matrix(@c1);

        if ( rank( $C1 ) > $dimSpace ) {
          my $tmp1 = $j + 1;
	  $notInSpace++;
          push( @notInSpaceMessage, "Vector number $tmp1 is not in the space.$BR" );
        } elsif ( ! $s[$j]->isZero ) {
          push( @wasInSpace, $s[$j] );
        }
      }

      # How many independent were in the space
      my $goodCount = 0;
      my $secondaryDependenceTest1 = 0;
      if ( @wasInSpace ) { 
        my $C1 = concatenate_columns_into_matrix( @wasInSpace );
        $goodCount = rank( $C1 );

        # Add a second test for a linear dependence of this part of the students answers, 
        # in case the rank code misbehaves. This is a revised version of the test originally used.

        # It was needed for the case t=0.00000000002 in the test example discussed above
        my $dd = (($C1->transpose) * $C1)->det;
        if ( ( $dd == Real(0) ) && ($goodCount == scalar( @wasInSpace ) ) ) {
          $secondaryDependenceTest1 = 1;
          # warn "secondaryDependenceTest1 turned on";
        }
      }

      if ( ( $goodCount == $dimSpace ) && ( $goodCount == $numStudV ) && ($secondaryDependenceTest1 == 0 ) ) {
        # There are the correct number of independent vectors from the required space, and no others
        return 1;
      }

      my $depWarn = "";
      if ( $secondaryDependenceTest1 == 1 ) {
        # The value of $goodCount was WRONG. Decrease it by one, and add a warning if the result is still > 1
        if ( ( --$goodCount ) > 1 ) {
          $depWarn = "The software may have an incorrect count of the number of indepenent vectors from the space in your answer.$BR";
        }
      }

      if ( $goodCount == 1 ) {
        unshift( @notInSpaceMessage, "Your answer contains only one independent vector from the space.$BR$depWarn" );
      } elsif ( $goodCount > 1 ) {
        unshift( @notInSpaceMessage, "Your answer contains $goodCount independent vectors from the space.$BR$depWarn" );
      }

      # Add a second test for a linear dependence of ALL of the students answers,
      # in case the rank code misbehaves. This is a revised version of the test originally used.
      my $secondaryDependenceTest2 = 0;

      $dd = (($S->transpose) * $S)->det;
      # To debug
      # warn "The determinant tested against zero to check linearly dependence had the value $dd";

      if ( $dd == Real(0) ) {
        $secondaryDependenceTest2 = 1;
        # warn "secondaryDependenceTest2 turned on";
      }

      #  Check that the student's vectors are linearly independent
      if ( ( $rankS < $numStudV ) || ( ( $secondaryDependenceTest1 + $secondaryDependenceTest2 ) > 0 ) ) {
	# There is a linear dependence among the students answers.
	# Sometimes the detection of linear dependence conflicts with that of a vector not in the space.
        # So give the message only if nothing was found to be outside the space.
        if ( $notInSpace == 0 ) {
          unshift( @notInSpaceMessage, "Your vectors are linearly dependent.$BR");
        }
      } 
      # else {
      #    The students vectors are linearly independent.
      # }

      $self->{ans}[0]->{ans_message} = join("", @notInSpaceMessage );

      my $score = ( $goodCount / $dimSpace ) - ( ( $notInSpace / $dimSpace / 4 ) ) ;
      $score = 0 if ( $score < 0 ); # in case penalties exceed credit for good vectors
      # $self->{ans}[0]->{ans_message} .= "$BR$BR score = $score";

      # Scale due to MultiAnswer issue with fractional scores
      $score *= ( $dimSpace )/( -1 + $dimSpace );

      return $score;
}

##########################################

sub basis_checker_columns {

      my ( $correct, $student, $answerHash ) = @_;
      my @c = @{$correct};
      my @s = @{$student};

      # Most of the answer checking is done on integers
      # or on decimals like 0.24381729, so we will set the
      # tolerance accordingly in a local context.
      my $context = Context()->copy;
      $context->flags->set(
        tolerance => 0.001,
        tolType => "absolute",
      );

      return 0 if scalar(@s) < scalar(@c);  # count the number of vector inputs

      my $C = concatenate_columns_into_matrix(@c);
      my $S = concatenate_columns_into_matrix(@s);

      # Put $C and $S into the local context so that
      # all of the computations that follow will also be in
      # the local context.
      $C = Matrix($context,$C);
      $S = Matrix($context,$S);

      #  Theorem: A^T A is invertible if and only if A has linearly independent columns.

      #  Check that the professor's vectors are, in fact, linearly independent.
      $CTC = ($C->transpose) * $C;
      warn "Correct answer is a linearly dependent set." if ($CTC->det == 0);

      #  Check that the student's vectors are linearly independent
      if ( (($S->transpose) * $S)->det == 0) {
        Value->Error("Your vectors are linearly dependent");
        return 0;
      }

      # Next 2 lines added for local testing... 
      my $dd = (($S->transpose) * $S)->det;
      # warn "The determinant tested against zero to check linearly dependence had the value $dd";

      # S = student, C = correct, X = change of basis matrix
      # Solve S = CX for X using (C^T C)^{-1} C^T S = X.
      $X = ($CTC->inverse) * (($C->transpose) * $S);
      return $S == $C * $X;

}


#############################################


1;

test_CustomBasisChecker1.pg follows:

# Test problem to test/debug issues I had with basis_checker_columns()
# from pg/macros/MatrixCheckers.pl. 

DOCUMENT();        # This should be the first executable line in the problem.

loadMacros(
  "PGstandard.pl",
  "MathObjects.pl",
  "parserMultiAnswer.pl",
  #"MatrixCheckers.pl",
  "rank.pl",
  "CustomBasisChecker1.pl",	# includes a slightly modified version of basis_checker_columns 
				# and a redesigned basis_checker_columns_tani which depends on
				# the new rank.pl macro
);

TEXT(beginproblem());

$showPartialCorrectAnswers = 1;
$showPartialCredit = 1;

Context('Matrix');

$vec1 = Matrix([[ 1,0,3,4 ]])->transpose;
$vec2 = Matrix([[ 0,1,-1,-1 ]])->transpose;

$vec3 = ($vec1 + 2*$vec2)->transpose;

$shifta = Matrix([[ 0,0,0,0.00000000001 ]]);
$vec4a = ( sqrt(253) * $vec3 ) + $shifta ;

$r34a = rank( Matrix( [
  $vec3->row(1),
  $vec4a->row(1)
]));

$shiftb = Matrix([[ 0,0,0,0.0000000001 ]]);
$vec4b = ( sqrt(253) * $vec3 ) + $shiftb;

$r34b = rank( Matrix( [
  $vec3->row(1),
  $vec4b->row(1)
]));

$multians1 = MultiAnswer($vec1, $vec2)->with(
  singleResult => 1,
  separator => ',',
  tex_separator => ',',
  allowBlankAnswers=>0,
#  checker => ~~&basis_checker_columns_tani,
  checker => ~~&basis_checker_columns, # Had issues fixed by the _tani version
);

Context()->texStrings;
BEGIN_TEXT

Try this problem with both of the options for the checker function.
$PAR

Test answers where the first element is \( $vec3 \) and the second one is \( \sqrt{253} $vec3 \)
plus a small change to one coordinate. I did testing with changes to the last coordinate.
$PAR

$HR

Find a basis \( \mathcal{B} \) of \( \text{Span}\left\lbrace  $vec1, $vec2 \right\rbrace \).
$BR


$BCENTER
\( \mathcal{B} = \left\lbrace \;\; \rule{0pt}{50pt} \right. \;\; b_1 = \)
\{ $multians1->ans_array(1) \}
\( , \;\; b_2 = \)
\{ $multians1->ans_array(25) \}
\( \left. \;\; \rule{0pt}{50pt} \;\; \right\rbrace \)
$ECENTER
$PAR
$HR

When the original basis_checker_columns grader is being used:
$BR
When \( 0.0000001 \) or even \( 0.0022 \) is added to the last coordinate, the original basis_checker_columns
will ${BBOLD}incorrectly${EBOLD} treat the answer as ${BBOLD}correct${EBOLD}.
$PAR
When \( 0.0000002 \) is added the last coordinate, the vectors are reported as being linearly dependent.
$PAR

$HR

When the proposed replacement basis_checker_columns_tani grader is being used:
$BR
When a shift of \( 0.0000000001 \) or more is added to the last coordinate, the new code
will detect that the second vector is not in the space. It then gives partial credit for the
good vector less a penalty for the bad one.$BR

When a shift of \( 0.00000000001 \) is added to the last coordinate, the new code
will ignore the small change and treat the second vector as being a multiple of the
first vector and report there being a linear dependence among the vectors of the answer.
$BR

The code will not report a linear dependence among the vectors of the student's answer
when it has detected a vector not being in the space. If both messages had been permitted,
sometimes both would have been given in a contradictory way. An addition of \( 0.0000000001 \)
to the last coordinate was triggering such behavior.

$HR

The new code depends on a new rank function, and that function uses fuzzy comparisions of the
diagonal entries in the R from the LR decomposition, and this leads it to behaving reasonably
well but not perfectly.
$BR
The rank of the matrix whose rows are \( $vec3 \) and
\( (\sqrt{253} $vec3 ) + $shiftb \) will be computed to be \( $r34b\)  (comes out \(2\) on my PC).
$BR
The rank of the matrix whose rows are \( $vec3 \) and
\( (\sqrt{253} $vec3 ) + $shifta \) will be computed to be \( $r34a\)  (comes out \(1\) on my PC).


END_TEXT
Context()->normalStrings;

ANS( $multians1->cmp() );

ENDDOCUMENT();        # This should be the last executable line in the problem.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions