So the idea is to model reasoning as a preference relation over sets of propositions. Given sets of propositions S1 and S2, we might have the relation S1 < S2, which we can read as "S2 is preferred over S1." Operationally, this means that if our entire belief set were initially S1, we would be willing to replace it (in its entirety) with S2.
It is convenient to set aside < for now and define things with ≤ since that is the language of partial orders. We can read S1 ≤ S2 as "S2 is acceptable from S1." S1 ≤ S2 means S1 < S2 or S1 = S2.
The relation ≤ should be a partial order. It should be reflexive: S ≤ S for all sets of propositions S. It should be transitive: S1 ≤ S2 and S2 ≤ S3 should imply S1 ≤ S3. It should be antisymmetric: if S1 ≤ S2 and S2 ≤ S1 then S1 = S2.
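These three axioms can be checked mechanically for any finite collection of belief sets. As a sketch (the helper name `is_partial_order` is my own, not from the text), here is a brute-force verification in Python, using subset inclusion as the acceptability relation:

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, transitivity, and antisymmetry of `leq`
    over a finite collection of elements."""
    reflexive = all(leq(s, s) for s in elements)
    transitive = all(
        leq(a, c)
        for a, b, c in product(elements, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    antisymmetric = all(
        a == b
        for a, b in product(elements, repeat=2)
        if leq(a, b) and leq(b, a)
    )
    return reflexive and transitive and antisymmetric

# Subset inclusion on sets of propositions is a partial order.
sets = [frozenset(), frozenset({"A"}), frozenset({"A", "B"})]
print(is_partial_order(sets, lambda s, t: s <= t))  # True
```

Subset inclusion is only one choice of ≤; any relation passing these three checks qualifies.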
There can be sets of propositions S1 and S2 where S1 ≰ S2 and S2 ≰ S1. In this case the two sets are simply incomparable: neither is acceptable from the other, so a reasoner at either one would not move to the other.
The preference relation S1 < S2 can then be defined as S1 ≤ S2 and S1 ≠ S2.
For example: we may prefer {A, A->B, B} over {A, A->B} alone, so {A, A->B} < {A, A->B, B}, and therefore from {A, A->B} we would prefer to add the proposition B and reach {A, A->B, B}. This is (a particular example of) modus ponens.
In general, for a formal deductive logic, a reasonable preference relation could have S1 < S2 iff (1) every formula in S2 is derivable from the conjunction of formulas in S1, and (2) S2 is a superset of S1. Condition (2) means we are never "giving up" any theorems as we reason, only adding to the set. This is not the only possible preference relation for the logic, but it's one reasonable choice.
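This two-condition preference relation can be sketched concretely for a toy fragment of propositional logic whose only inference rule is modus ponens. The representation below (atoms as strings, A -> B as the tuple `("->", A, B)`, and the names `closure` and `preferred`) is an illustrative assumption, not something fixed by the text:

```python
def closure(s):
    """Deductive closure of a set of formulas under modus ponens alone.
    Atoms are strings; an implication A -> B is the tuple ("->", A, B)."""
    derived = set(s)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            if isinstance(f, tuple) and f[0] == "->" and f[1] in derived:
                if f[2] not in derived:
                    derived.add(f[2])
                    changed = True
    return derived

def preferred(s1, s2):
    """S1 < S2 iff (1) every formula in S2 is derivable from S1,
    and (2) S2 is a strict superset of S1."""
    return s1 < s2 and s2 <= closure(s1)

s1 = frozenset({"A", ("->", "A", "B")})
s2 = frozenset({"A", ("->", "A", "B"), "B"})
print(preferred(s1, s2), preferred(s2, s1))  # True False
```

Note that this reproduces the modus ponens example above: {A, A->B} < {A, A->B, B}, but not the reverse, since the reverse move would give up the theorem B.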
Since this is preference rather than deduction, it can account for non-logical or empirical reasoning, such as preferring one hypothesis to explain the data over another. This would be represented as {D, H1} < {D, H2} where D is the data, H1 is one hypothesis, and H2 is a preferred hypothesis.
It would be necessary to mark propositions with their provenance, so that we are allowed to throw out and change hypotheses as above, but not allowed to throw out and change the data.
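One minimal way to sketch provenance marking, assuming a representation of my own invention (tag each proposition as `("data", ...)` or `("hyp", ...)`), is to add a side condition that an acceptable move must preserve all data-tagged propositions:

```python
def respects_provenance(s1, s2):
    """A move from s1 to s2 is admissible only if every data-tagged
    proposition in s1 survives into s2; hypotheses may be swapped freely.
    The tags "data"/"hyp" are illustrative, not from the original text."""
    data = {p for p in s1 if p[0] == "data"}
    return data <= s2

s1 = {("data", "D"), ("hyp", "H1")}
s2 = {("data", "D"), ("hyp", "H2")}   # swap hypotheses: allowed
s3 = {("hyp", "H2")}                  # drops the data: not allowed
print(respects_provenance(s1, s2), respects_provenance(s1, s3))  # True False
```

The preference {D, H1} < {D, H2} from above passes this filter, while any move that discards D is ruled out.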
Reasoning would consist of moving through the preference graph from less-preferred to more-preferred sets of propositions.
A "truth" of a starting proposition set S1 would be defined as any proposition P such that there is a proposition set S2 with S1 ≤ S2, and for every S3 with S2 ≤ S3, P ∈ S3. (By reflexivity, S2 ≤ S2, so in particular P ∈ S2.)
In other words, in our ascent up the preference graph from S1, if we are able to reach a point where we conclude P, and where from that point all the nodes above us also have P, then P was a truth of our starting point S1.
A truth is a proposition that holds "in the limit" as we ascend the preference graph.
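For a finite preference graph, this "truth in the limit" definition can be computed directly: take ≤ as the reflexive-transitive closure of the graph's edges, and test the definition by brute force. The function names below are my own:

```python
def reachable(start, edges):
    """All sets s with start ≤ s, taking ≤ as the reflexive-transitive
    closure of the given directed edge list."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                frontier.append(b)
    return seen

def truths(start, edges):
    """P is a truth of `start` if some reachable S2 contains P and every
    S3 reachable from S2 (including S2 itself) also contains P."""
    result = set()
    for s2 in reachable(start, edges):
        for p in s2:
            if all(p in s3 for s3 in reachable(s2, edges)):
                result.add(p)
    return result

A, AB, B = "A", ("->", "A", "B"), "B"
s1 = frozenset({A, AB})
s2 = frozenset({A, AB, B})
print(truths(s1, [(s1, s2)]) == s2)  # True
```

In this tiny graph, B is a truth of {A, A->B} even though B is not in the starting set, because the ascent reaches {A, A->B, B} and nothing above it lacks B.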
We might want to require that the set of truths as we ascend should be unchanged. In other words, if P is a truth of S1, and S1 < S2, then we demand P also be a truth of S2. This restricts the possible structure of the preference graph.
We may also wish to talk about what beliefs a person would immediately recognize as preferable from their current belief set, as opposed to what would be preferable after a million inference steps. This can be represented with a relation <ᵢ where the i stands for "immediate," and A <ᵢ B means "B is immediately preferred to A." The set of all edges (A, B) where A <ᵢ B would form a directed acyclic graph, which could be extended to A < B by taking the transitive closure. In a deductive logic, A <ᵢ B would hold provided that B = A ∪ {b} where b is a proposition, not previously in A, that can be obtained in one step by applying an inference rule to some of the propositions in A.
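The transitive-closure step mentioned above can be sketched as a fixed-point computation over the immediate-preference edges. The integers here stand in for belief sets, and the function name is my own:

```python
def transitive_closure(edges):
    """Extend the immediate-preference edges (A <i B) to the full
    preference relation (A < B) by composing edges until no new
    pair appears."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Two single inference steps chain into a derived (non-immediate) preference.
edges = {(1, 2), (2, 3)}
print((1, 3) in transitive_closure(edges))  # True
```

Because the immediate edges form a directed acyclic graph, this closure is itself acyclic, so the derived < is a strict order as required.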
Are there any aspects of reasoning that can't be captured by a system like this?